UTF-7 Encoder & Decoder

Convert text to and from UTF-7 encoding, a legacy format for representing Unicode in ASCII-compatible 7-bit systems. Mainly used for historical or compatibility testing.

UTF-7 Encoder/Decoder

Encode text to UTF-7 format or decode UTF-7 back to text

About UTF-7

UTF-7 (7-bit Unicode Transformation Format) is a variable-length character encoding designed to encode Unicode text using only 7-bit ASCII characters. It was originally intended for email systems that couldn't handle 8-bit data.

Most printable ASCII characters remain unencoded, while all other characters are converted to UTF-16 and written as modified Base64, prefixed with + and terminated by -. The closing - may be omitted when the next character cannot be part of the Base64 run. Note that UTF-7 has known security vulnerabilities and is generally not recommended for new applications.

How the UTF-7 Encoder/Decoder Works

Enter text to encode to UTF-7, or paste UTF-7 encoded text to decode. The tool processes the conversion with proper handling of shifted sequences.

UTF-7 encodes Unicode using only 7-bit ASCII characters. Most ASCII characters pass through unchanged. Non-ASCII characters are written as Base64 between + and - markers.

The encoder shifts to Base64 mode for non-ASCII characters, then shifts back. The decoder detects shift sequences and converts them back to Unicode. The tool also handles optional direct characters.
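The shift behavior described above can be tried with Python's built-in "utf-7" codec. This is only a sketch of the general technique, not this tool's own implementation; the helper names utf7_encode and utf7_decode are made up for illustration.

```python
# Python ships a "utf-7" codec in the standard library.
# The helper names below are illustrative, not part of any library.

def utf7_encode(text: str) -> bytes:
    """Return the UTF-7 form of text as 7-bit ASCII bytes."""
    return text.encode("utf-7")


def utf7_decode(data: bytes) -> str:
    """Decode UTF-7 bytes back to a Unicode string."""
    return data.decode("utf-7")


if __name__ == "__main__":
    for sample in ["Hello", "a+b", "Hi Mom -☺-!", "日本語"]:
        encoded = utf7_encode(sample)
        # Every output byte fits in 7 bits, and decoding restores the input.
        assert all(byte < 0x80 for byte in encoded)
        assert utf7_decode(encoded) == sample
        print(f"{sample!r} -> {encoded!r}")
```

Plain ASCII input comes back unchanged, a literal + becomes +-, and the non-ASCII samples appear as +...- blocks.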

When You'd Actually Use This

Legacy email system support

Old email systems only handle 7-bit ASCII. UTF-7 enables Unicode in these systems. Rarely needed today, but still found in some legacy environments.

IMAP folder name encoding

IMAP uses modified UTF-7 for folder names. International mailbox names need encoding. Decode folder names for display.
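Python's standard library has no codec for the IMAP variant, so the following is a minimal sketch based on the rules in RFC 3501: printable ASCII is kept as is, & is written as &-, and everything else becomes UTF-16BE in Base64 with , in place of / and no padding. The function names are hypothetical, and error handling for malformed input is deliberately left out.

```python
import base64


def imap_utf7_encode(name: str) -> str:
    """Encode a mailbox name as IMAP modified UTF-7 (illustrative sketch)."""
    out = []
    pending = []  # characters waiting to be written as one &...- block

    def flush() -> None:
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii").rstrip("=")
            out.append("&" + b64.replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:      # printable ASCII is direct
            flush()
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)
    flush()
    return "".join(out)


def imap_utf7_decode(encoded: str) -> str:
    """Decode IMAP modified UTF-7 back to text (illustrative sketch)."""
    out = []
    i = 0
    while i < len(encoded):
        ch = encoded[i]
        if ch != "&":
            out.append(ch)
            i += 1
            continue
        end = encoded.index("-", i)      # ValueError if the block is unterminated
        chunk = encoded[i + 1:end]
        if chunk == "":
            out.append("&")              # "&-" is a literal ampersand
        else:
            b64 = chunk.replace(",", "/")
            b64 += "=" * (-len(b64) % 4)  # restore the padding Base64 expects
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)


if __name__ == "__main__":
    print(imap_utf7_encode("Entwürfe"))
    print(imap_utf7_decode("~peter/mail/&U,BTFw-/&ZeVnLIqe-"))
```

The second call uses the example mailbox name from RFC 3501.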

Security research

UTF-7 has known security issues. Research encoding-based attacks. Understand why UTF-7 is deprecated in many contexts.

Protocol analysis

Some protocols historically used UTF-7. Decode captured traffic. Understand legacy protocol behavior.

Encoding education

UTF-7 demonstrates 7-bit Unicode encoding. Educational value for understanding encoding challenges. Historical interest in encoding evolution.

Working with old systems

Maintaining legacy applications? May encounter UTF-7. Decode for modern processing. Migrate to UTF-8 when possible.
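For a migration, the decode-then-re-encode step might look like the sketch below. It assumes the legacy data really is standard UTF-7 (not the IMAP variant), and the function name utf7_to_utf8 is invented for this example.

```python
def utf7_to_utf8(data: bytes) -> bytes:
    """Transcode UTF-7 bytes to UTF-8 bytes (illustrative helper)."""
    # Decoding is strict by default, so input the codec rejects raises
    # UnicodeDecodeError instead of being silently altered.
    text = data.decode("utf-7")
    return text.encode("utf-8")


if __name__ == "__main__":
    legacy = b"Subject: +ZeVnLIqe-"
    print(utf7_to_utf8(legacy).decode("utf-8"))
```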

What to Know Before Using

UTF-7 is deprecated. Modern systems use UTF-8. UTF-7 has security vulnerabilities. Only use when required by legacy systems.

Shift characters mark encoding. + starts a Base64 shift, - ends it. A literal + must be encoded as +-. The shift mechanism enables ASCII passthrough.

Modified Base64 is used. UTF-7 encodes the UTF-16 (big-endian) form of the text in Base64 without = padding. Standard UTF-7 (RFC 2152) keeps the usual Base64 alphabet; the IMAP variant replaces / with a comma.
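To make the Base64 step concrete, here is a small sketch that builds one shifted block by hand. It assumes the RFC 2152 form (UTF-16 big-endian, standard Base64 alphabet, padding removed); utf7_shift_block is a hypothetical helper name.

```python
import base64


def utf7_shift_block(chars: str) -> str:
    """Encode a run of characters as a single +...- block (illustrative)."""
    utf16 = chars.encode("utf-16-be")          # 2 bytes per BMP character
    b64 = base64.b64encode(utf16).decode("ascii")
    return "+" + b64.rstrip("=") + "-"         # drop the "=" padding


if __name__ == "__main__":
    block = utf7_shift_block("☺")              # U+263A -> bytes 26 3A
    print(block)
    # The hand-built block decodes with Python's built-in codec.
    assert block.encode("ascii").decode("utf-7") == "☺"
```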

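Direct characters vary. Set D: always direct (letters, digits, and '(),-./:?). Set O: optionally direct (punctuation such as !"#$%&*;<=>@[]^_`{|}). Set B: the Base64 alphabet used inside shifted blocks. IMAP's modified UTF-7 treats all printable ASCII except & as direct. Know your variant.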
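As a rough illustration of how an encoder might decide what to do with each character, the sketch below uses the character sets that RFC 2152 names Set D (always direct) and Set O (optionally direct). The classify function and its return labels are assumptions made for this example, not part of any library.

```python
import string

# Character sets as listed in RFC 2152.
SET_D = set(string.ascii_letters + string.digits + "'(),-./:?")
SET_O = set('!"#$%&*;<=>@[]^_`{|}')
WHITESPACE = set(" \t\r\n")   # space, tab, CR, LF may also be written directly


def classify(ch: str) -> str:
    """Return how standard UTF-7 may represent a single character."""
    if ch in SET_D or ch in WHITESPACE:
        return "direct"
    if ch in SET_O:
        return "optional"     # encoder's choice: direct or Base64
    if ch == "+":
        return "escape"       # written as "+-"
    return "shift"            # must go inside a +...- Base64 block


if __name__ == "__main__":
    for ch in ["A", "!", "+", "~", "\\", "é"]:
        print(repr(ch), classify(ch))
```

Note that ~ and \ are ASCII but fall into the "shift" group, which is why not every ASCII character passes through unchanged.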

Pro tip: Never use UTF-7 for web content. Browsers may interpret it differently, enabling XSS attacks. Always use UTF-8 for web applications.

Common Questions

What is UTF-7?

7-bit Unicode encoding. Uses ASCII for common characters, Base64 for others. Designed for email systems that couldn't handle 8-bit data.

Why was UTF-7 created?

Early email systems were 7-bit only. UTF-7 enabled Unicode in these systems. Solution before 8-bit-clean systems were universal.

Is UTF-7 still used?

Rarely. IMAP folder names use modified UTF-7. Some legacy systems. Deprecated for web and modern applications. Use UTF-8 instead.

What are the security issues?

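The same bytes can be interpreted in multiple ways. Markup written in UTF-7 contains no raw < or > characters, so it can bypass filters and enable XSS if a client later decodes it as UTF-7. Modern browsers no longer support UTF-7.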
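The filter-bypass problem can be shown in a few lines. This sketch assumes a deliberately naive, hypothetical filter (naive_filter_allows) that only looks for raw angle brackets; it is meant for understanding the issue, not as an attack tool.

```python
# A UTF-7 payload made entirely of harmless-looking ASCII.
payload = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"


def naive_filter_allows(data: bytes) -> bool:
    """Hypothetical filter: rejects input only if it has raw < or >."""
    return b"<" not in data and b">" not in data


# What a client would see if it interpreted the bytes as UTF-7.
decoded = payload.decode("utf-7")

if __name__ == "__main__":
    print("filter allows payload:", naive_filter_allows(payload))
    print("decoded as UTF-7:", decoded)
```

The filter passes the payload, yet decoding it as UTF-7 produces a script tag. Declaring UTF-8 explicitly and never decoding untrusted input as UTF-7 avoids this mismatch.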

How does UTF-7 compare to UTF-8?

UTF-8 is more efficient and secure. UTF-7 is 7-bit safe but complex. UTF-8 is the universal standard today.

What's IMAP UTF-7?

Modified UTF-7 for IMAP folder names. Uses & instead of + as the shift character and , instead of / in the Base64 alphabet. Different direct character set. Specific to the IMAP protocol.

Should I use UTF-7?

Only if required by legacy systems. For new development, always use UTF-8. UTF-7 is obsolete for most purposes.