Encode or Decode XML Entities

Safely encode special characters such as <, >, and & into XML entities to keep your XML well-formed, or decode entities back to their original characters.

Character   Entity
&           &amp;
<           &lt;
>           &gt;
"           &quot;
'           &apos;

About XML Entities

XML entities are named references that stand in for characters with special meaning in XML markup. The five predefined entities cover the ampersand (&), less-than (<), greater-than (>), double quote ("), and apostrophe ('). Encode these characters to include them safely in XML content.

How It Works

This XML entity encoder/decoder converts special characters to their XML entity equivalents and vice versa. It ensures your text is safe for inclusion in XML documents without breaking the structure.

The encoding process:

  1. Scan input: Text is scanned for characters that have special meaning in XML.
  2. Replace with entities: Special characters are replaced with their entity equivalents (& becomes &amp;, < becomes &lt;, etc.).
  3. Output safe text: The result is safe to include in XML element content or attributes.
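
The three steps above can be sketched in Python. This is a minimal illustration, not the tool's actual implementation: the function name encode_xml is hypothetical, and it relies on escape() from the standard library's xml.sax.saxutils module, which replaces &, <, and > by default and accepts extra replacements for the two quote characters.

```python
from xml.sax.saxutils import escape

# escape() already handles &, <, and >; the quotes are passed as extras.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def encode_xml(text: str) -> str:
    """Replace the five XML special characters with their predefined entities."""
    # escape() replaces & first, so the ampersands it introduces
    # in &lt;, &quot;, etc. are not encoded a second time.
    return escape(text, QUOTE_ENTITIES)


print(encode_xml('if a < b && name == "O\'Neil"'))
# if a &lt; b &amp;&amp; name == &quot;O&apos;Neil&quot;
```

The order matters: the ampersand has to be replaced before any other character, otherwise the & in a freshly inserted &lt; would itself be turned into &amp;.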

Decoding reverses the process, converting entities back to their original characters. The tool handles both named entities (&amp;, &lt;, &gt;, &quot;, &apos;) and numeric character references (&#60;, &#x3C;).
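
A decoder that accepts both named entities and numeric character references might look like the following sketch. The name decode_xml and the regular expression are assumptions made for illustration; the tool's own implementation is not shown in this page. Unknown names and out-of-range code points are left untouched rather than guessed at.

```python
import re

# The five predefined XML entities.
NAMED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "apos": "'"}

# Matches &name;, &#60; (decimal), and &#x3C; (hexadecimal).
_REFERENCE = re.compile(r"&(#[0-9]+|#x[0-9A-Fa-f]+|[A-Za-z]+);")


def decode_xml(text: str) -> str:
    """Convert predefined entities and numeric references back to characters."""

    def replace(match: re.Match) -> str:
        body = match.group(1)
        try:
            if body.startswith("#x"):
                return chr(int(body[2:], 16))
            if body.startswith("#"):
                return chr(int(body[1:]))
        except (ValueError, OverflowError):
            # Code point outside the Unicode range: keep the original text.
            return match.group(0)
        # Names other than the five predefined ones are left as-is.
        return NAMED.get(body, match.group(0))

    # A single pass: replaced text is never rescanned, so "&amp;lt;"
    # decodes to "&lt;" and not all the way to "<".
    return _REFERENCE.sub(replace, text)


print(decode_xml("&lt;p&gt; &#60; &#x3C; &amp;lt;"))
# <p> < < &lt;
```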

When You'd Actually Use This

Including code examples in XML documentation

Your XML docs need to show code with < and > characters. Encode them so the XML parser doesn't confuse them with tags.

Storing user-generated content in XML

User comments or posts may contain special characters. Encode before storing in XML to prevent parsing errors or injection issues.

Creating XML with mathematical formulas

Math expressions use <, >, and &. Encode these characters so formulas display correctly when the XML is rendered.

Debugging malformed XML errors

Getting 'invalid character' errors? Check the failing text for unescaped special characters, especially & and <, and encode them before parsing again.

Preparing data for RSS/Atom feeds

Feed item descriptions often contain HTML or special chars. Encode content properly to ensure feed validators pass.

Working with legacy XML systems

Older systems may not handle CDATA sections. Encode special characters the traditional way for maximum compatibility.

What to Know Before Using

Five predefined entities in XML

XML defines five named entities: &amp; (ampersand), &lt; (less than), &gt; (greater than), &quot; (double quote), &apos; (apostrophe).

CDATA sections avoid encoding

Wrap text in <![CDATA[...]]> to include special characters without encoding. However, a CDATA section cannot contain the sequence ]]>, because that sequence ends the section.
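
A common workaround for the ]]> restriction is to split the text into two adjacent CDATA sections wherever the sequence occurs. The sketch below shows the idea; to_cdata is a hypothetical helper name, and the round trip is checked with the standard library's ElementTree parser.

```python
import xml.etree.ElementTree as ET


def to_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any "]]>" across two sections."""
    # "]]" closes the first section's content and ">" opens the next one.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


raw = "if (a[b[0]]> 1 && x < y)"
document = "<code>" + to_cdata(raw) + "</code>"

# The parser joins the adjacent sections back into the original text.
assert ET.fromstring(document).text == raw
```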

Attributes need different handling

In attribute values, the quote character that delimits the value must be encoded: &quot; inside a double-quoted value, &apos; inside a single-quoted one. Quotes in element content never need encoding, though encoding them is harmless.
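
In Python, quoteattr() from xml.sax.saxutils illustrates this rule: it escapes &, <, and >, picks a delimiter, and encodes a quote only when the value contains both kinds. The attribute name title and the helper make_item below are made up for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

print(quoteattr('5" disk'))           # '5" disk'  (single-quoted, nothing encoded)
print(quoteattr("it's"))              # "it's"     (double-quoted, nothing encoded)
print(quoteattr('say "hi", it\'s'))   # "say &quot;hi&quot;, it's"


def make_item(title: str) -> str:
    """Build an element with a safely quoted attribute value."""
    # quoteattr() returns the value with its surrounding quotes included.
    return f"<item title={quoteattr(title)}/>"


title = 'Tom & "Jerry" <it\'s back>'
assert ET.fromstring(make_item(title)).get("title") == title
```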

Numeric entities work everywhere

&#60; (decimal) and &#x3C; (hex) both represent <. Numeric character references work even where a named entity isn't defined.

Double encoding breaks data

Encoding already-encoded text creates &amp;amp; instead of &amp;. Check whether text is already encoded before running it through the encoder again.
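
The effect is easy to reproduce with escape() and unescape() from Python's xml.sax.saxutils. The looks_encoded check is a hypothetical heuristic added for illustration: it only detects that an entity reference is present and cannot tell whether the author meant the literal text "&amp;".

```python
import re
from xml.sax.saxutils import escape, unescape

once = escape("Tom & Jerry")
twice = escape(once)
print(once)    # Tom &amp; Jerry
print(twice)   # Tom &amp;amp; Jerry

# Decoding once removes only one layer of encoding.
assert unescape(twice) == once

# Heuristic only: flags text that already contains an entity reference.
_ENTITY = re.compile(r"&(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);")


def looks_encoded(text: str) -> bool:
    return bool(_ENTITY.search(text))


def encode_once(text: str) -> str:
    """Encode only if the text does not already appear to be encoded."""
    return text if looks_encoded(text) else escape(text)
```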

Common Questions

What's the difference between encoding and escaping?

In the XML context, they're the same thing: replacing special characters with entity references. 'Escaping' is the more common term in programming.

Do I need to encode numbers and letters?

No. Only the five special characters (&, <, >, ", ') ever need encoding, and of those only & and < must always be encoded in element content. Letters, digits, and most other symbols are safe as-is.

When should I use CDATA instead of encoding?

Use CDATA for large blocks of text with many special characters (like code or HTML). Use encoding for short snippets or when CDATA isn't supported.

Can I encode all characters as numeric entities?

Yes, but it's verbose. &#60;hello&#62; works but is harder to read than &lt;hello&gt;. Use numeric entities mainly for non-ASCII characters.

Why does & need to be encoded?

& starts every entity reference in XML, so a parser treats an unencoded & as the beginning of one and fails if no valid reference follows. Encode it as &amp; to include a literal ampersand.

How do I handle non-ASCII characters?

UTF-8 encoded XML handles most characters directly. For special cases, use numeric entities like &#233; for é or &#x416; for Cyrillic Ж.
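
When the output has to stay pure ASCII, Python's built-in xmlcharrefreplace error handler produces decimal references for everything outside that range. The function name to_ascii_xml is hypothetical; note that this step does not touch &, <, or >, which still need regular entity encoding.

```python
def to_ascii_xml(text: str) -> str:
    """Replace every non-ASCII character with a decimal character reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(to_ascii_xml("café Ж"))
# caf&#233; &#1046;
```

&#1046; is the decimal form of &#x416;, so both spellings refer to the same Cyrillic letter.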

What happens if I don't encode special characters?

The document is no longer well-formed, and a conforming XML parser will reject it with an error such as 'invalid character' or 'unexpected token'.
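
The failure can be observed with Python's built-in ElementTree parser. This is a small sketch, with is_well_formed as a hypothetical helper; the exact error message depends on the parser in use.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<p>AT&T</p>"))       # False: bare ampersand
print(is_well_formed("<p>1 < 2</p>"))      # False: bare less-than
print(is_well_formed("<p>AT&amp;T</p>"))   # True
print(is_well_formed("<p>1 &lt; 2</p>"))   # True
```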