Many developers encounter mysterious bugs when copying text from documentation, Slack messages, Word documents, Notion pages, or AI tools like ChatGPT. The real cause is often invisible Unicode characters.
These characters are not visible on screen but can break JSON parsing, APIs, database queries, authentication tokens, and application logic.
Below is a developer reference for common Unicode characters that silently cause bugs.
The Zero Width Space (U+200B) is an invisible character that marks a possible line-break point without showing a visible space. It often appears when copying text from messaging apps or rich text editors.
Why it breaks code:
Hello​World
The invisible character between the two words is U+200B, so the string has 11 characters and does not equal "HelloWorld".
text.replace(/\u200B/g, "")
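A minimal sketch of the effect described above, with the invisible character written as an escape so it is explicit in the source. The variable names are illustrative only.

```javascript
// "Hello" + ZERO WIDTH SPACE (U+200B) + "World"
const dirty = "Hello\u200BWorld";
const clean = dirty.replace(/\u200B/g, "");

console.log(dirty === "HelloWorld"); // false
console.log(dirty.length);           // 11
console.log(clean === "HelloWorld"); // true
```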
The Zero Width Non-Joiner (U+200C) prevents adjacent characters from joining in writing systems such as Persian and Arabic. However, it can accidentally end up in copied code or identifiers.
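Why it breaks code: two identifiers or keys that look identical on screen are different strings if one of them contains U+200C.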
text.replace(/\u200C/g, "")
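One way this can surface is a property lookup that silently fails. The sketch below assumes a hypothetical config object whose key was pasted with a hidden U+200C; the names are made up for illustration.

```javascript
// Hypothetical config object; the key contains a hidden U+200C.
const config = { ["api\u200CKey"]: "secret" };

console.log(Object.keys(config)[0].length); // 7, not 6
console.log(config.apiKey);                 // undefined

// Rebuilding the object with cleaned keys restores the lookup.
const normalized = Object.fromEntries(
  Object.entries(config).map(([key, value]) => [key.replace(/\u200C/g, ""), value])
);
console.log(normalized.apiKey); // "secret"
```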
The Zero Width Joiner (U+200D) joins adjacent characters in typography and in emoji composition, where it combines several emoji into a single glyph. While useful for rendering, it can appear accidentally in copied text.
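Why it breaks code: a string containing a stray U+200D fails equality checks and lookups against the visually identical clean string.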
text.replace(/\u200D/g, "")
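Because U+200D is also what holds emoji sequences together, removing it everywhere can change legitimate text. A small sketch of both cases, assuming the family emoji sequence man + ZWJ + woman + ZWJ + girl:

```javascript
// Family emoji: man + ZWJ + woman + ZWJ + girl, typically rendered as one glyph.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}";
const stripped = family.replace(/\u200D/g, "");

console.log([...family].length);   // 5 code points
console.log([...stripped].length); // 3 code points, now three separate emoji

// A stray ZWJ inside an ASCII-only value such as a token can be removed safely.
const token = "abc\u200D123".replace(/\u200D/g, "");
console.log(token === "abc123"); // true
```

A reasonable policy is to strip U+200D only from fields that should never contain emoji or complex-script text, such as identifiers, tokens, and keys.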
The Word Joiner (U+2060) prevents a line break at its position. Like other invisible characters, it may appear when copying text from formatted documents.
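Why it breaks code: U+2060 adds an extra character that changes the string length and defeats exact matches on tokens, keys, and IDs.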
text.replace(/\u2060/g, "")
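The four zero-width characters covered so far can also be removed in a single pass with one character class. This is a sketch; the helper name stripZeroWidth is hypothetical.

```javascript
// U+200B, U+200C, U+200D (a contiguous range) plus U+2060.
const ZERO_WIDTH = /[\u200B-\u200D\u2060]/g;

// Hypothetical helper that removes all four characters.
function stripZeroWidth(text) {
  return text.replace(ZERO_WIDTH, "");
}

const id = stripZeroWidth("user\u2060_\u200Bid\u200C\u200D");
console.log(id === "user_id"); // true
console.log(id.length);        // 7
```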
The Non-Breaking Space (U+00A0) looks identical to a regular space (U+0020) but is a different character. It often appears when copying text from HTML pages or formatted editors.
Why it breaks code: U+00A0 is not U+0020, so splitting on " " and exact string comparisons do not treat it as a space.
text.replace(/\u00A0/g, " ")
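A short sketch of the difference, using a made-up Authorization-style value in which the separator was pasted as U+00A0:

```javascript
// Illustrative value: NBSP (U+00A0) where a regular space was intended.
const pasted = "Bearer\u00A0abc123";

console.log(pasted === "Bearer abc123"); // false
console.log(pasted.split(" ").length);   // 1, the NBSP is not a split point

const fixed = pasted.replace(/\u00A0/g, " ");
console.log(fixed.split(" ")); // [ "Bearer", "abc123" ]

// In JavaScript, the \s class does match U+00A0, so splitting on /\s+/ also works.
console.log(/\s/.test("\u00A0")); // true
```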
The Byte Order Mark (BOM, U+FEFF) is sometimes inserted at the beginning of UTF-8 files. If it is still present when JSON data is parsed, parsing fails immediately.
Why it breaks code: JSON.parse does not accept U+FEFF as leading whitespace and throws a SyntaxError.
text.replace(/\uFEFF/g, "")
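The failure and the fix can be sketched as follows. This variant strips only a leading BOM, which is a narrower choice than the global replacement above; the sample payload is made up.

```javascript
// Made-up payload: JSON text preceded by a BOM, as if read from a UTF-8 file.
const raw = '\uFEFF{"ok":true}';

let failed = false;
try {
  JSON.parse(raw);
} catch (err) {
  failed = err instanceof SyntaxError;
}
console.log(failed); // true

// Remove the BOM only where it acts as a BOM: at the very start of the text.
const parsed = JSON.parse(raw.replace(/^\uFEFF/, ""));
console.log(parsed.ok); // true
```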
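The Unicode Line Separator (U+2028) is a line terminator in JavaScript. JSON allows it unescaped inside strings, but JavaScript engines before ES2019 rejected it inside string literals, so JSON embedded directly in script source could fail to parse.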
text.replace(/\u2028/g, "")
The Paragraph Separator (U+2029) behaves like the Line Separator and causes the same problems when text data containing it is embedded in JavaScript source.
text.replace(/\u2029/g, "")
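When the separators are meaningful data, escaping them is an alternative to deleting them. The sketch below shows that JSON.stringify leaves both characters unescaped and how to escape them before embedding JSON in script source; the variable names are illustrative.

```javascript
const value = "line one\u2028line two\u2029end";

// JSON.stringify emits U+2028 and U+2029 as raw characters.
const json = JSON.stringify(value);
console.log(json.includes("\u2028")); // true

// JSON.parse accepts them inside a JSON string, so the round trip works.
console.log(JSON.parse(json) === value); // true

// Escape instead of delete: the output is still valid JSON and keeps the data.
const safeForScript = json
  .replace(/\u2028/g, "\\u2028")
  .replace(/\u2029/g, "\\u2029");
console.log(safeForScript.includes("\u2028"));    // false
console.log(JSON.parse(safeForScript) === value); // true
```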
Manually searching for invisible characters can be extremely difficult. A dedicated cleaning tool helps detect and remove hidden Unicode characters instantly.
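As a rough illustration of what such detection involves, the sketch below scans a string for the characters covered in this reference and reports where they occur. The helper name findInvisible and the report shape are assumptions for this example, not the behavior of any particular tool.

```javascript
// Characters covered in this reference, labeled with the names used above.
const INVISIBLE = new Map([
  ["\u200B", "ZERO WIDTH SPACE"],
  ["\u200C", "ZERO WIDTH NON-JOINER"],
  ["\u200D", "ZERO WIDTH JOINER"],
  ["\u2060", "WORD JOINER"],
  ["\u00A0", "NON-BREAKING SPACE"],
  ["\uFEFF", "BYTE ORDER MARK"],
  ["\u2028", "LINE SEPARATOR"],
  ["\u2029", "PARAGRAPH SEPARATOR"],
]);

// Hypothetical helper: returns one entry per invisible character found.
function findInvisible(text) {
  const hits = [];
  for (let i = 0; i < text.length; i++) {
    const name = INVISIBLE.get(text[i]);
    if (name) {
      const hex = text.charCodeAt(i).toString(16).toUpperCase().padStart(4, "0");
      hits.push({ index: i, code: "U+" + hex, name });
    }
  }
  return hits;
}

const report = findInvisible("key\u200B=\u00A0value");
console.log(report);
// [ { index: 3, code: "U+200B", name: "ZERO WIDTH SPACE" },
//   { index: 5, code: "U+00A0", name: "NON-BREAKING SPACE" } ]
```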
Unicode Cleaner detects characters like the ones described above.