Unicode Normalizer
Normalize Unicode text to ensure consistency. Convert between NFC, NFD, NFKC, and NFKD forms for reliable text comparison and storage.
Normalization Forms
How it works
This tool normalizes Unicode text to standard forms (NFC, NFD, NFKC, NFKD). Unicode allows multiple ways to represent the same character—either as a single code point or as a base character plus combining marks.
Normalization converts text to a consistent representation. NFC composes characters (single code point where possible). NFD decomposes them (base + combining marks). NFKC and NFKD also apply compatibility mappings.
Normalization forms:
- NFC: Composed form (default for most uses)
- NFD: Decomposed form (useful for searching)
- NFKC/NFKD: Compatibility forms (normalize ligatures, superscripts, etc.)

Paste text and select a normalization form. The tool shows the before and after code points so you can see exactly what changed.
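The behavior of the four forms can be checked directly with Python's standard-library unicodedata module (a minimal sketch; the tool itself is language-agnostic):

```python
import unicodedata

s = "\u00e9"  # "é" as a single composed code point

# Apply each normalization form and show the resulting code points
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    out = unicodedata.normalize(form, s)
    print(form, [f"U+{ord(c):04X}" for c in out])
# NFC  ['U+00E9']
# NFD  ['U+0065', 'U+0301']
# NFKC ['U+00E9']
# NFKD ['U+0065', 'U+0301']
```

For a plain accented letter, the canonical and compatibility forms agree; they only diverge on compatibility characters such as ligatures or superscripts.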
When you'd actually use this
Fixing string comparison failures
A developer's code says "é" doesn't equal "é". One is U+00E9 (composed), the other is e + U+0301 (decomposed). Normalizing both to NFC makes them match.
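A minimal reproduction of the bug and its fix, using Python's unicodedata module:

```python
import unicodedata

composed = "\u00e9"     # é as a single code point (U+00E9)
decomposed = "e\u0301"  # é as 'e' plus U+0301 combining acute

print(composed == decomposed)  # False: same text, different code points

# Normalize both sides to NFC before comparing
same = (unicodedata.normalize("NFC", composed)
        == unicodedata.normalize("NFC", decomposed))
print(same)  # True
```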
Cleaning user input for storage
A database stores user names inconsistently—some with composed accents, some decomposed. Normalizing all input to NFC ensures consistent storage and reliable lookups.
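One common pattern is to normalize at the write boundary so the database only ever sees one form. A sketch with a hypothetical `clean_username` helper:

```python
import unicodedata

def clean_username(raw: str) -> str:
    # Hypothetical input-cleaning step: trim, then normalize to NFC
    # so every stored name uses the composed form
    return unicodedata.normalize("NFC", raw.strip())

# The same name typed on two different keyboards/IMEs
stored_a = clean_username("Rene\u0301e ")  # decomposed input
stored_b = clean_username("Ren\u00e9e")    # composed input
print(stored_a == stored_b)  # True: lookups now behave consistently
```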
Implementing search functionality
A search feature should find "café" whether users type composed or decomposed. The developer normalizes both the index and queries to NFD for consistent matching.
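A sketch of that approach: derive one canonical key for both the index and the query (the `search_key` helper and the NFD-plus-casefold combination are illustrative choices, not the only valid ones):

```python
import unicodedata

def search_key(text: str) -> str:
    # Hypothetical indexing key: decompose (NFD) and casefold so that
    # composed/decomposed and upper/lower variants all produce the same key
    return unicodedata.normalize("NFD", text).casefold()

index = {search_key("Café"): "doc-42"}  # indexed form (composed input)
query = "cafe\u0301"                    # user typed the decomposed form
print(index.get(search_key(query)))     # doc-42
```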
Processing text from multiple sources
A data pipeline ingests text from various systems—some use NFC, some NFD. Normalizing everything to one form prevents downstream comparison and sorting issues.
Handling compatibility characters
Text contains ligatures like "fi" (U+FB01) that should match "fi". NFKC normalization converts compatibility characters to their canonical equivalents for consistent processing.
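The ligature case is easy to reproduce: a plain substring search misses the ligature until NFKC folds it to its compatibility equivalent.

```python
import unicodedata

text = "\ufb01le"  # "file" written with the fi ligature (U+FB01)

print("fi" in text)  # False: the ligature is a single distinct code point
nfkc = unicodedata.normalize("NFKC", text)
print(nfkc)          # file
print("fi" in nfkc)  # True after compatibility normalization
```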
Debugging Unicode edge cases
A QA engineer investigates why two visually identical strings compare differently. They normalize and compare code points to find the hidden difference.
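A small debugging helper along those lines (the `dump` function is illustrative) prints each code point with its official Unicode name, which makes the hidden difference obvious:

```python
import unicodedata

def dump(label: str, s: str) -> None:
    # Show each code point and its Unicode name to expose hidden differences
    print(label, [(f"U+{ord(c):04X}", unicodedata.name(c, "?")) for c in s])

a, b = "\u00e9", "e\u0301"  # visually identical, compare unequal
dump("a:", a)  # LATIN SMALL LETTER E WITH ACUTE
dump("b:", b)  # LATIN SMALL LETTER E, COMBINING ACUTE ACCENT
```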
What to know before using it
NFC is the recommended default. W3C and Unicode recommend NFC for web content. It's the most compact form and what most systems expect. Use NFC unless you have a specific reason to choose another form.
Normalization can change string length. NFD expands characters: é (1 code point) becomes e + combining accent (2 code points). This affects length calculations and buffer allocations.
NFKC/D loses information. Compatibility normalization converts "fi" to "fi" and superscript ² to a regular 2. This is irreversible. Only use NFKC/D when you want to discard compatibility distinctions.
Some strings can't be normalized to match. Visually similar characters from different scripts (homoglyphs) won't normalize to the same form. Latin 'A' and Cyrillic 'А' remain distinct after normalization.
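A quick check confirms that no normalization form merges homoglyphs:

```python
import unicodedata

latin_a = "A"          # U+0041 LATIN CAPITAL LETTER A
cyrillic_a = "\u0410"  # U+0410 CYRILLIC CAPITAL LETTER A

# Normalization never maps one script's letter onto another's
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    assert unicodedata.normalize(form, latin_a) != unicodedata.normalize(form, cyrillic_a)
print("still distinct under every form")
```

Detecting homoglyph confusion takes a separate tool (for example, a confusables check), not normalization.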
Pro tip: Always normalize before comparing strings for equality. But normalize both strings the same way. Comparing NFC to NFD will still fail even though they represent the same text.
Common questions
What's the difference between NFC and NFD?
NFC composes characters where possible (é as U+00E9). NFD decomposes them (e + U+0301 combining acute). Both display identically but have different code points.
When should I use NFKC or NFKD?
Use NFKC/D when you want to normalize compatibility variants: ligatures, full-width characters, superscripts. But be aware it's lossy; don't use it for text that needs to preserve exact formatting.
Does normalization affect emoji?
Most emoji aren't affected: emoji code points generally have no canonical decompositions, so NFC and NFD leave them unchanged, and skin tone modifiers, ZWJ sequences, and flag emoji (regional indicator pairs) pass through as-is. A few text-origin characters with emoji presentation, such as ™ (U+2122), do have compatibility mappings, so NFKC/D can convert them to plain text.
How do I normalize in code?
JavaScript: str.normalize('NFC'). Python: unicodedata.normalize('NFC', text). Java: Normalizer.normalize(str, Normalizer.Form.NFC). Most languages have built-in normalization support.
Can normalization break text?
NFC and NFD are reversible and won't break text. NFKC/D can change meaning by converting compatibility characters. Use NFKC/D carefully and only when you understand the implications.
Why do I need Unicode normalization?
Without normalization, visually identical text can compare as different. This causes bugs in search, sorting, and data deduplication. Normalization ensures consistent representation.
Is normalization slow?
Modern normalization is fast. For most applications, the overhead is negligible. It's worth the cost to avoid Unicode comparison bugs. Batch normalize on input rather than every comparison for best performance.
Other Free Tools
- Unicode Character Lookup
- Unicode Text Converter
- Unicode Character Counter
- Unicode Whitespace Remover
- ASCII to Hex Converter
- Barcode Generator
- Binary to Text Converter
- Printable Calendar Maker
- Pie Chart Maker