
Unicode Confusables Detector

Find visually similar Unicode characters that can trick users. Detect homoglyphs in domains, usernames, or text to prevent spoofing attacks.

Unicode Confusables & Homoglyph Detector

Detect visually similar characters that can be used for phishing or spoofing

Useful for detecting homoglyph attacks in URLs, usernames, and domain names

What are Confusables?

Confusables (homoglyphs) are characters that look similar but have different Unicode code points. They can be used in phishing attacks to create deceptive URLs or usernames.

High Risk Examples
а (Cyrillic) vs a (Latin) | е (Cyrillic) vs e (Latin) | о (Cyrillic) vs o (Latin)
Common Attack
"аррӏе.com" (with Cyrillic characters) vs "apple.com"

How the Unicode Confusables Detector Works

Enter text to analyze for confusable characters. The tool checks each character against the Unicode confusables database and identifies those that look similar to other characters.

Confusables include homoglyphs: Latin 'A' and Cyrillic 'А'. Different code points, identical appearance. The detector shows all confusable alternatives for each character.

Security analysis highlights potential spoofing risks. See which characters could be confused in your text. Essential for security auditing and input validation.
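As a rough illustration of the per-character check described above, here is a minimal Python sketch. It is not this tool's implementation: the `LOOKALIKES` table and the `find_confusables` function are hypothetical, and the table is a tiny hand-picked subset standing in for the full confusables database.

```python
import unicodedata

# Hypothetical, hand-picked subset of look-alike pairs for illustration only.
# A real detector would load the full confusables data instead.
LOOKALIKES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0443": "y",  # CYRILLIC SMALL LETTER U
    "\u04cf": "l",  # CYRILLIC SMALL LETTER PALOCHKA
    "\u0410": "A",  # CYRILLIC CAPITAL LETTER A
}


def find_confusables(text):
    """Return one report entry per character that has a known Latin look-alike."""
    findings = []
    for index, char in enumerate(text):
        if char in LOOKALIKES:
            findings.append({
                "index": index,
                "char": char,
                "code_point": f"U+{ord(char):04X}",
                "name": unicodedata.name(char, "UNKNOWN"),
                "looks_like": LOOKALIKES[char],
            })
    return findings


# Example: the first letter below is Cyrillic, the rest is Latin.
for finding in find_confusables("\u0430pple.com"):
    print(finding)
```

Reporting the code point and official character name alongside the look-alike makes the finding easy to verify by hand.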

When You'd Actually Use This

Detecting phishing attempts

URLs with confusable characters? 'раураl.com' looks like 'paypal.com'. Detect homoglyph attacks. Protect users from phishing.

Validating usernames

Prevent confusing usernames. 'admin' vs 'аdmin' (Cyrillic a). Reject confusables. Avoid impersonation risks.
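One way to sketch such a username check in Python is to collapse look-alikes to a common form and compare against protected names. This is loosely modeled on the "skeleton" idea from Unicode Technical Standard #39 but is heavily simplified. The `LOOKALIKES` table, the `RESERVED` set, and both function names are assumptions made for illustration.

```python
# Hypothetical subset of look-alike mappings; a real check would use the
# full confusables data rather than this short table.
LOOKALIKES = {
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
}

# Hypothetical list of names that must not be imitated.
RESERVED = {"admin", "root", "support"}


def skeleton(name):
    """Lowercase the name and replace known look-alikes with Latin letters."""
    return "".join(LOOKALIKES.get(char, char) for char in name.lower())


def is_impersonation(name):
    """True when the name is not reserved itself but collapses to a reserved name."""
    lowered = name.lower()
    return lowered not in RESERVED and skeleton(lowered) in RESERVED


print(is_impersonation("\u0430dmin"))  # first letter is Cyrillic
print(is_impersonation("alice"))
```

Comparing skeletons rather than raw strings catches a spoofed name even when only one character was swapped.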

Security auditing

Audit user input for confusables. Identify potential spoofing. Security best practice for forms and APIs.

Domain name analysis

Check domains for confusables. Homoglyph domains used for phishing. Identify suspicious registrations.

Code review for security

Variable names with confusables? Could be malicious. Audit code for homoglyphs. Prevent supply chain attacks.
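For Python source, a simple audit can be sketched with the standard `tokenize` module: list every identifier that contains a non-ASCII character and review the hits by hand. The function name `non_ascii_identifiers` is hypothetical, and flagging all non-ASCII names is a deliberately blunt heuristic, since some may be legitimate.

```python
import io
import tokenize


def non_ascii_identifiers(source):
    """Return (line, column, identifier) for each name with non-ASCII characters."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and not tok.string.isascii():
            hits.append((tok.start[0], tok.start[1], tok.string))
    return hits


# The second assignment uses a Cyrillic first letter, so it is a different variable.
sample = "admin = 1\n\u0430dmin = 2\n"
for line, column, name in non_ascii_identifiers(sample):
    print(f"line {line}, column {column}: {name!r}")
```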

Learning about Unicode security

Understand confusable attacks. See real examples. Educational tool for security awareness.

What to Know Before Using

Confusables aren't always malicious. Some are legitimate (accented letters). Context matters. Don't flag all confusables as attacks.

Many scripts have confusables. Latin/Cyrillic/Greek most common. But also CJK, Arabic, others. Any similar-looking characters can confuse.

Some confusables are intentional. Stylistic choices, brand names. Not all confusables are attacks. Consider legitimate use cases.

Unicode has a confusables file. confusables.txt defines known confusables. This tool uses that database. Authoritative source from Unicode Consortium.

Pro tip: For security-critical identifiers (usernames, domains), restrict input to ASCII or to a single expected script. Don't allow mixed scripts where confusables are possible.
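A mixed-script check can be approximated in Python as follows. The standard library does not expose the Unicode Script property, so this sketch assumes the first word of each character's official name (LATIN, CYRILLIC, GREEK, ...) is a good enough stand-in. That is a heuristic, and the function names are hypothetical.

```python
import unicodedata


def scripts_used(text):
    """Approximate the set of scripts in text from Unicode character names."""
    scripts = set()
    for char in text:
        if not char.isalpha():
            continue  # digits, dots, and hyphens are shared across scripts
        name = unicodedata.name(char, "")
        if name:
            scripts.add(name.split()[0])
    return scripts


def is_mixed_script(text):
    """True when the letters in text come from more than one script."""
    return len(scripts_used(text)) > 1


print(is_mixed_script("paypal.com"))
print(is_mixed_script("\u0440\u0430\u0443\u0440\u0430l.com"))  # Cyrillic plus Latin 'l'
```

A string written entirely in one non-Latin script passes this check, so it complements confusable detection rather than replacing it.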

Common Questions

What are confusable characters?

Characters that look similar or identical. Latin 'A' (U+0041) and Cyrillic 'А' (U+0410). Different code points, same appearance.

How are confusables used in attacks?

Homoglyph attacks replace characters. 'paypal' becomes 'раураl'. Users can't tell the difference. Phishing and impersonation.

How do I prevent confusable attacks?

Validate input scripts. Restrict to expected characters. Use confusable detection. Educate users about the risk.

Are all confusables security risks?

No. Accented letters are confusables but legitimate. Context determines risk. Security-critical fields need stricter validation.

What's a homoglyph?

Characters that look the same. From Greek 'homos' (same) + 'glyphe' (carving). Many confusables are homoglyphs from different scripts, though look-alikes also exist within one script, such as 'l' and '1'.

Can confusables affect search?

Yes. Search for 'cafe' won't find 'саfе' (Cyrillic). Different code points. Search engines handle this differently.

How do browsers handle confusables?

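Modern browsers display IDN domains in Punycode form when they look suspicious, for example when a label mixes scripts. The address bar then shows the 'xn--' form instead of the look-alike characters. The exact rules vary by browser.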
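To see what the 'xn--' form looks like, the following Python sketch converts a domain label by label using the built-in `punycode` codec. It skips the mapping and validation steps a full IDNA implementation performs, and the function names are hypothetical.

```python
def to_ace_label(label):
    """Encode one domain label into its ASCII 'xn--' form if it is not ASCII."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


def to_ace_domain(domain):
    """Convert each dot-separated label of a domain to its ASCII form."""
    return ".".join(to_ace_label(part) for part in domain.split("."))


# An all-Cyrillic look-alike of "apple" followed by an ASCII top-level domain.
print(to_ace_domain("\u0430\u0440\u0440\u04cf\u0435.com"))
print(to_ace_domain("apple.com"))
```

Comparing the ASCII form of a suspicious link against the expected domain makes the difference visible even when the rendered text looks identical.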