Unicode Converter

Convert between characters and their Unicode code points

Examples:

A→U+0041
€→U+20AC
中→U+4E2D
😀→U+1F600
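
The mappings above can be reproduced in a few lines of code. The following is a minimal Python sketch, not this tool's actual implementation; the function names to_u_plus and from_u_plus are hypothetical and chosen only for illustration.

```python
def to_u_plus(char: str) -> str:
    """Return the U+XXXX notation for a single code point."""
    if len(char) != 1:
        raise ValueError("expected exactly one code point")
    # Upper-case hex, padded to at least four digits, as in U+0041.
    return f"U+{ord(char):04X}"


def from_u_plus(notation: str) -> str:
    """Return the character for a string such as 'U+20AC'."""
    if not notation.upper().startswith("U+"):
        raise ValueError("expected a string starting with 'U+'")
    return chr(int(notation[2:], 16))


if __name__ == "__main__":
    for ch in ["A", "€", "中", "😀"]:
        print(f"{ch} → {to_u_plus(ch)}")
    print(from_u_plus("U+4E2D"))
```

Note that to_u_plus accepts exactly one code point, so a multi-code-point emoji sequence would need to be split first.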

About This Tool

Unicode is a universal character encoding standard that assigns a unique code point to each character it encodes, across most of the world's writing systems. This converter translates between characters and their code point representations in several formats.

What is Unicode?

Unicode is a computing industry standard for the consistent encoding, representation, and handling of text in most of the world's writing systems. As of version 13.0 it defines over 143,000 characters covering 154 modern and historic scripts, as well as emoji, symbols, and mathematical notation. Each character is assigned a unique code point, written as U+ followed by four to six hexadecimal digits (for example, U+0041 or U+1F600).

Unicode Formats

  • U+ Format: Standard Unicode notation (U+0041 for 'A')
  • Hex Format: JavaScript/JSON escape sequence (\u0041)
  • Decimal Format: HTML numeric entity (&#65;)
  • Hex HTML: HTML hex entity (&#x41;)
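
As an illustration of the four formats listed above, here is a small Python sketch that produces each of them for one character. The function name unicode_formats and the dictionary keys are hypothetical. One assumption worth flagging: for code points above U+FFFF the sketch emits a UTF-16 surrogate pair, since a \uXXXX escape in JavaScript or JSON holds only four hex digits; the tool itself may format such characters differently.

```python
def unicode_formats(char: str) -> dict:
    """Return the four notations for a single code point."""
    cp = ord(char)
    if cp <= 0xFFFF:
        js_escape = f"\\u{cp:04X}"
    else:
        # Split into a UTF-16 surrogate pair (assumption: the target is
        # a \uXXXX-style escape, which is limited to four hex digits).
        offset = cp - 0x10000
        high = 0xD800 + (offset >> 10)
        low = 0xDC00 + (offset & 0x3FF)
        js_escape = f"\\u{high:04X}\\u{low:04X}"
    return {
        "u_plus": f"U+{cp:04X}",
        "js_escape": js_escape,
        "html_decimal": f"&#{cp};",
        "html_hex": f"&#x{cp:X};",
    }


if __name__ == "__main__":
    for name, value in unicode_formats("A").items():
        print(f"{name}: {value}")
```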

Common Use Cases

Unicode conversion is essential for web development when handling international characters, displaying special symbols and emoji, debugging character encoding issues, working with escape sequences in programming, and ensuring cross-platform text compatibility. It's particularly useful when dealing with characters outside the ASCII range or when you need to inspect the exact code point of a character.
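
For the debugging use case, one possible approach is to list every character outside the ASCII range together with its position, code point, and official name. The sketch below uses Python's standard unicodedata module; the function name describe_non_ascii and the "<no name>" fallback string are assumptions made for this example.

```python
import unicodedata


def describe_non_ascii(text: str) -> list:
    """Return (index, code point, name) for each non-ASCII character."""
    report = []
    for index, ch in enumerate(text):
        if ord(ch) > 0x7F:
            # unicodedata.name accepts a default for unnamed code points.
            name = unicodedata.name(ch, "<no name>")
            report.append((index, f"U+{ord(ch):04X}", name))
    return report


if __name__ == "__main__":
    for entry in describe_non_ascii("caf\u00e9 \u20ac5"):
        print(entry)
```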

UTF-8 vs UTF-16

While Unicode defines code points, UTF-8 and UTF-16 are encoding schemes that determine how those code points are stored as bytes. UTF-8 uses 1 to 4 bytes per code point and is backward compatible with ASCII, making it the most popular encoding for the web. UTF-16 uses 2 or 4 bytes per code point and is used internally by JavaScript and many operating systems. Both encodings represent the same Unicode characters but as different byte sequences.
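
The difference is easy to see by encoding the same characters both ways. This Python sketch is only an illustration; the function name encoded_bytes is hypothetical, and it uses the big-endian "utf-16-be" codec so that no byte order mark is added to the output.

```python
def encoded_bytes(char: str) -> dict:
    """Return the UTF-8 and UTF-16 (big-endian) bytes as hex strings."""
    return {
        "utf-8": char.encode("utf-8").hex(" ").upper(),
        "utf-16": char.encode("utf-16-be").hex(" ").upper(),
    }


if __name__ == "__main__":
    for ch in ["A", "€", "中", "😀"]:
        result = encoded_bytes(ch)
        print(f"{ch}  UTF-8: {result['utf-8']}  UTF-16: {result['utf-16']}")
```

Here 'A' takes one byte in UTF-8 and two in UTF-16, while the emoji takes four bytes in both: a four-byte sequence in UTF-8 and a surrogate pair in UTF-16.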

Emoji and Special Characters

Modern Unicode includes thousands of emoji and special characters. Most emoji have code points at or above U+1F000, although some older ones sit in the Basic Multilingual Plane. A single displayed emoji may also consist of several code points combined, such as a base emoji followed by a skin tone modifier, or several emoji joined by zero-width joiners (U+200D). This converter handles all Unicode planes, including the Supplementary Multilingual Plane where most emoji reside, and supports code points from U+0000 to U+10FFFF.
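
To see how one displayed emoji can be made of several code points, the following Python sketch splits a string into its code points and reports the plane of each. The function names code_points and plane are hypothetical; the sample sequences (thumbs up with a medium skin tone modifier, and a family joined by zero-width joiners) are chosen only as examples.

```python
def code_points(text: str) -> list:
    """Return the U+XXXX notation of every code point in the text."""
    return [f"U+{ord(ch):04X}" for ch in text]


def plane(char: str) -> int:
    """Return the plane number (0 = BMP, 1 = SMP, ...) of a code point."""
    return ord(char) >> 16


# Thumbs up followed by the medium skin tone modifier.
thumbs_up_medium = "\U0001F44D\U0001F3FD"
# Man, woman, and girl joined by zero-width joiners (U+200D).
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"

if __name__ == "__main__":
    print(code_points(thumbs_up_medium))
    print(code_points(family))
    print(plane("A"), plane("\U0001F600"))
```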