Whitespace Characters in Unicode: Spaces You Didn't Know Existed
Published March 15, 2025
More than just the space bar
Most developers think of whitespace as the space character, tab, and newline. But Unicode gives the White_Space property to 25 characters and defines several more invisible formatting characters, each with a specific typographic or functional purpose. Understanding them helps you debug mysterious formatting issues, write better text processing code, and produce cleaner typography.
Common whitespace characters
| Name | Code point | HTML | Notes |
|---|---|---|---|
| Space | U+0020 | `&#32;` | Standard word separator |
| No-Break Space | U+00A0 | `&nbsp;` | Prevents line break |
| Character Tabulation | U+0009 | `&#9;` | Horizontal tab |
| Line Feed | U+000A | `&#10;` | Unix newline |
| Carriage Return | U+000D | `&#13;` | Part of Windows CRLF |
Typographic spaces
Unicode includes spaces of specific widths, inherited from traditional typesetting. Their names refer to the metal type units used in letterpress printing:
| Name | Code point | Width |
|---|---|---|
| En Space | U+2002 | Half an em (nominally the width of the letter “n”) |
| Em Space | U+2003 | One em, equal to the font size (nominally the width of the letter “M”) |
| Three-Per-Em Space | U+2004 | 1/3 of an em |
| Four-Per-Em Space | U+2005 | 1/4 of an em |
| Six-Per-Em Space | U+2006 | 1/6 of an em |
| Figure Space | U+2007 | Width of a digit (for aligning numbers) |
| Punctuation Space | U+2008 | Width of a period |
| Thin Space | U+2009 | 1/5 of an em (roughly) |
| Hair Space | U+200A | Thinnest typographic space |
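Because these spaces look much like an ordinary space, they can quietly break string comparisons. The following is a minimal sketch of one way to fold them back to U+0020 before comparing. The helper name `normalizeSpaces` and the exact character range are illustrative choices, not a standard API.

```javascript
// No-break space plus the fixed-width spaces from the table above,
// written as escapes so they stay visible in source code.
const WIDE_SPACES = /[\u00A0\u2002-\u200A]/g;

// Hypothetical helper: replace each of those characters with a plain U+0020.
function normalizeSpaces(text) {
  return text.replace(WIDE_SPACES, " ");
}

const distance = "1\u2009000\u00A0km"; // thin space, then no-break space

console.log(distance === "1 000 km");                  // false
console.log(normalizeSpaces(distance) === "1 000 km"); // true
console.log(/\s/.test("\u2003"));                      // true: \s matches Em Space
```

Normalizing discards the typographic intent, so a sketch like this is better suited to comparison or search keys than to text that will be displayed.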
Zero-width characters
These characters take up no visible space but affect text processing and rendering:
| Name | Code point | Purpose |
|---|---|---|
| Zero Width Space | U+200B | Marks a possible line break point without visible space |
| Zero Width Non-Joiner | U+200C | Prevents ligature formation in scripts like Arabic and Devanagari |
| Zero Width Joiner | U+200D | Requests ligature or cursive joining; used in emoji ZWJ sequences |
| Word Joiner | U+2060 | Prevents line break (like No-Break Space, but zero-width) |
The Zero Width Joiner (ZWJ) has become especially well-known through emoji. A ZWJ between two emoji requests that the platform render them as a single combined glyph, enabling sequences like 👩💻 (woman technologist = 👩 + ZWJ + 💻).
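To make the structure of that sequence concrete, here is a small sketch that builds it from escapes and inspects it. The variable names are illustrative, and whether the result displays as one glyph depends on the platform's fonts.

```javascript
const ZWJ = "\u200D";
const woman = "\u{1F469}";  // 👩
const laptop = "\u{1F4BB}"; // 💻

// woman + ZWJ + laptop is the "woman technologist" sequence.
const technologist = woman + ZWJ + laptop;

// Spreading a string iterates by code point, not by UTF-16 code unit.
const codePoints = [...technologist].map(
  (c) => "U+" + c.codePointAt(0).toString(16).toUpperCase()
);

console.log(codePoints);          // [ 'U+1F469', 'U+200D', 'U+1F4BB' ]
console.log(codePoints.length);   // 3 code points
console.log(technologist.length); // 5 UTF-16 code units (2 + 1 + 2)

// Removing the joiner leaves the two original emoji side by side.
const separated = technologist.split(ZWJ).join("");
console.log(separated === woman + laptop); // true
```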
Visual comparison
Different spaces have different widths. Here each space type is shown between vertical bars so you can compare:
| Name | Visual width |
|---|---|
| Regular Space | \| \| |
| No-Break Space | \| \| |
| En Space | \| \| |
| Em Space | \| \| |
| Thin Space | \| \| |
| Hair Space | \| \| |
| Zero Width Space | \|​\| |
When each is useful
- Non-breaking space (`&nbsp;`): keep “100 km” or “Dr. Smith” on the same line
- Thin space: used as a thousands separator in French (1 000 000), in practice often in its non-breaking form (Narrow No-Break Space, U+202F), and between values and units in SI notation (9.8 m/s²)
- Figure space: aligning columns of numbers in plain text, since each figure space equals the width of a digit
- Zero-width space: adding line break opportunities in long URLs or paths without visible changes
- Word joiner: preventing line breaks at specific points (e.g., keeping a currency symbol attached to a number: $100)
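A few of these uses can be sketched as small string helpers. The function names below (`withUnit`, `padFigures`, `breakableSlashes`) are hypothetical, and the sketch assumes the text is only being prepared for display.

```javascript
const NBSP = "\u00A0";
const FIGURE_SPACE = "\u2007";
const ZWSP = "\u200B";

// Keep a value and its unit on the same line, e.g. "100 km".
function withUnit(value, unit) {
  return `${value}${NBSP}${unit}`;
}

// Right-align a number in a plain-text column by padding with figure spaces.
function padFigures(n, width) {
  return String(n).padStart(width, FIGURE_SPACE);
}

// Allow a line break after each slash in a long path.
// Display only: the copied text will contain the zero-width spaces.
function breakableSlashes(path) {
  return path.replace(/\//g, "/" + ZWSP);
}

console.log(withUnit(100, "km") === "100\u00A0km");       // true
console.log(padFigures(42, 5) === "\u2007\u2007\u200742"); // true
console.log(breakableSlashes("usr/local/bin").length);     // 15 (13 visible + 2 ZWSP)
```

Figure-space padding only lines up when the font gives all digits the same width, which is the assumption the figure space itself is built on.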
Debugging invisible characters
Invisible characters cause some of the most frustrating bugs — strings that look identical but don't match, or text that mysteriously won't parse. Here are strategies for finding them:
- Show whitespace in your editor (VS Code: “Render Whitespace: all”, Vim: `:set list`)
- Check string length: if `str.length` is larger than the number of visible characters, hidden characters may be present (keep in mind that emoji and other characters outside the Basic Multilingual Plane count as two UTF-16 code units)
- Log code points: in JavaScript, use `[...str].map(c => c.codePointAt(0).toString(16))` to see every character as a hex code point
- Use regex: match zero-width characters with `/[\u200B-\u200D\u2060\uFEFF]/g`
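The last two strategies can be combined into one small reporting function. This is a sketch rather than a complete detector: the name `findInvisible` and the lookup table are illustrative, and the pattern extends the regex above with the no-break space as an assumption about what is worth flagging.

```javascript
// Names for the characters this sketch reports.
const NAMES = {
  0x00a0: "NO-BREAK SPACE",
  0x200b: "ZERO WIDTH SPACE",
  0x200c: "ZERO WIDTH NON-JOINER",
  0x200d: "ZERO WIDTH JOINER",
  0x2060: "WORD JOINER",
  0xfeff: "ZERO WIDTH NO-BREAK SPACE (BOM)",
};

// Hypothetical helper: list each hidden character with its position.
function findInvisible(str) {
  const hits = [];
  for (const match of str.matchAll(/[\u00A0\u200B-\u200D\u2060\uFEFF]/g)) {
    const cp = match[0].codePointAt(0);
    hits.push({
      index: match.index, // position in UTF-16 code units
      codePoint: "U+" + cp.toString(16).toUpperCase().padStart(4, "0"),
      name: NAMES[cp],
    });
  }
  return hits;
}

const typed = "hello";
const pasted = "hel\u200Blo"; // looks identical on screen

console.log(typed === pasted);      // false
console.log(findInvisible(pasted)); // one hit: index 3, U+200B, ZERO WIDTH SPACE
console.log(findInvisible(typed));  // []
```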