White space is represented by \x20, \u0020, U+00000020, or %20
Unicode Converter - Free online Encode/Decode String Characters
Unicode Converter - What is Unicode?
Unicode is a Universal Character Set (UCS), a standard character encoding that does not depend on any platform or system. Today, Unicode is the encoding most systems use to represent characters as integers. Every character in the world, including digits, the letters of every language's alphabet, symbols, and emoticons, is assigned a unique number in Unicode, which we call its code point. Code points can be encoded with several encodings, such as UTF-8, UTF-16, and UTF-32, as well as some legacy encodings.
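To make the idea of a code point concrete, here is a minimal Python sketch that prints the common textual notations for a single character, including the four forms of the space character shown at the top of this page. The function name `describe` and the dictionary keys are illustrative choices, not part of any converter API.

```python
from urllib.parse import quote


def describe(ch: str) -> dict:
    """Return common textual notations for one character (illustrative helper)."""
    cp = ord(ch)  # the Unicode code point as an integer
    return {
        "decimal": cp,
        # \xNN can only express code points up to 0xFF
        "hex_escape": f"\\x{cp:02x}" if cp <= 0xFF else None,
        # \uNNNN covers the Basic Multilingual Plane, \UNNNNNNNN everything else
        "unicode_escape": f"\\u{cp:04x}" if cp <= 0xFFFF else f"\\U{cp:08x}",
        "code_point": f"U+{cp:04X}",
        # percent-encoding of the character's UTF-8 bytes
        "url": quote(ch, safe=""),
    }


if __name__ == "__main__":
    for key, value in describe(" ").items():
        print(f"{key:>15}: {value}")
```

For the space character this reports decimal 32, `\x20`, `\u0020`, `U+0020`, and `%20`. `chr(32)` goes the other way, from the number back to the character.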
How is Unicode encoded?
Unicode defines its mapping methods as Unicode Transformation Formats (UTF). As mentioned above, every character is assigned a code point; a UTF encoding maps each code point to a unique sequence of bytes. There are several UTF encodings, named with the pattern UTF-N, where “N” is the number of bits per code unit.
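As a rough illustration of how a UTF maps a code point to bytes, the sketch below encodes one code point as UTF-8 by hand using the standard bit layout (1 to 4 bytes). The function name `utf8_encode` is hypothetical; in practice `str.encode("utf-8")` does the same job.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode a single Unicode code point as UTF-8 (hand-written sketch)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a valid Unicode scalar value")
    if cp <= 0x7F:        # 1 byte:  0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:      # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


if __name__ == "__main__":
    for ch in "Aé中😀":
        # Compare the hand-written encoder with Python's built-in codec
        print(ch, utf8_encode(ord(ch)).hex(), ch.encode("utf-8").hex())
```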
- UTF-8: 8-bit encoding, used by most websites (WWW). It is compatible with ASCII because the first 128 Unicode code points are the same as ASCII, and it does not need a Byte Order Mark (BOM). UTF-8 is the default encoding in XML and HTML.
- UTF-32: 32-bit encoding, used as an internal representation of text in some programs, including wide (UCS-4) builds of the Python programming language since Python version 2.2.
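The ASCII compatibility and BOM points can be checked with a short Python sketch. It relies on the fact that Python's generic `"utf-16"` and `"utf-32"` codecs prepend a BOM when encoding, while `"utf-8"` does not; the helper names are illustrative.

```python
import codecs


def is_ascii_compatible(text: str) -> bool:
    """True if the UTF-8 bytes of `text` equal its ASCII bytes."""
    try:
        return text.encode("utf-8") == text.encode("ascii")
    except UnicodeEncodeError:
        return False  # text contains non-ASCII characters


def starts_with_bom(data: bytes) -> bool:
    """True if `data` begins with a UTF-8, UTF-16, or UTF-32 byte order mark."""
    boms = (
        codecs.BOM_UTF8,
        codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE,
        codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE,
    )
    return data.startswith(boms)


if __name__ == "__main__":
    print(is_ascii_compatible("Hello"))              # pure ASCII text
    print(starts_with_bom("Hello".encode("utf-8")))  # UTF-8: no BOM written
    print(starts_with_bom("Hello".encode("utf-16"))) # UTF-16: BOM written
    print(starts_with_bom("Hello".encode("utf-32"))) # UTF-32: BOM written
```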
Unicode encoding table
Why are there several types of UTF?
As the Unicode encoding table shows, each UTF encoding has different resource requirements.
UTF-8 needs the least disk space and memory for the lower code range (000000 – 00007F), which is ASCII (most of the American standard characters), because each of those characters is stored in a single 8-bit unit. However, characters from other languages need more: most Asian scripts take 3 bytes per character in UTF-8. As a consequence, the system has to read and combine several bytes to decode one character.
UTF-16 is friendlier for programs that handle Asian scripts and special symbols. Each character in the middle range (000080 – 00FFFF) takes a single 16-bit unit. That is the same size as UTF-8 for 000080 – 0007FF and one byte smaller for 000800 – 00FFFF, and the character can be decoded in one step instead of by combining several bytes.
UTF-32 is not widely used for storage or transmission at present because it needs 4 bytes for every character. It becomes more competitive for high-range characters (above 00FFFF), such as new emoticon symbols, where UTF-8 and UTF-16 also need 4 bytes.
“In short, a UTF encoding with smaller code units generally saves space but costs more computation to decode, while a UTF encoding with larger code units makes the opposite trade-off.”
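The space side of this trade-off is easy to measure. The sketch below counts how many bytes one character from each code range takes in each encoding; the sample characters and the helper name `byte_lengths` are illustrative. The `-le` codec variants are used so that no BOM is added to the count.

```python
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def byte_lengths(ch: str) -> dict:
    """Map each encoding name to the number of bytes `ch` occupies in it."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}


if __name__ == "__main__":
    samples = {
        "A": "ASCII range",
        "é": "000080 - 0007FF",
        "中": "000800 - 00FFFF",
        "😀": "above 00FFFF",
    }
    for ch, label in samples.items():
        print(f"U+{ord(ch):06X} ({label}): {byte_lengths(ch)}")
    # A  -> utf-8: 1, utf-16: 2, utf-32: 4
    # é  -> utf-8: 2, utf-16: 2, utf-32: 4
    # 中 -> utf-8: 3, utf-16: 2, utf-32: 4
    # 😀 -> utf-8: 4, utf-16: 4, utf-32: 4
```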
Base64 Converter - Encode / Decode
Base64 is a kind of data encoding, not encryption. It transforms the original data into a text format that a human cannot read at a glance, although anyone can decode it back. Base64 works by representing the data with a set of 64 ASCII characters (text characters), which is why we call it Base64 encoding.
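A minimal round trip with Python's standard `base64` module looks like the sketch below. Base64 operates on bytes, so the text is first encoded as UTF-8; that choice, and the helper names `to_base64` and `from_base64`, are assumptions made for illustration.

```python
import base64


def to_base64(text: str) -> str:
    """Encode text as UTF-8 bytes, then as a Base64 ASCII string."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def from_base64(encoded: str) -> str:
    """Decode a Base64 string back to the original UTF-8 text."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    encoded = to_base64("Hello")
    print(encoded)               # SGVsbG8=
    print(from_base64(encoded))  # Hello
```

Because decoding needs no key, the round trip shows why Base64 offers no secrecy.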
Base64 index table
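The standard Base64 index table assigns the values 0 to 63 to `A`-`Z`, `a`-`z`, `0`-`9`, `+`, and `/`. The sketch below rebuilds that table and uses it to encode one 3-byte group by hand, splitting 24 bits into four 6-bit indexes; the names `ALPHABET` and `encode_triplet` are illustrative.

```python
import string

# Index 0-25: A-Z, 26-51: a-z, 52-61: 0-9, 62: '+', 63: '/'
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_triplet(chunk: bytes) -> str:
    """Encode exactly 3 bytes as 4 Base64 characters (no padding handling)."""
    if len(chunk) != 3:
        raise ValueError("this sketch only handles full 3-byte groups")
    n = (chunk[0] << 16) | (chunk[1] << 8) | chunk[2]  # 24-bit integer
    indexes = [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indexes)


if __name__ == "__main__":
    for i in range(0, 64, 8):
        # Print the table eight entries per row as index:character pairs
        print("  ".join(f"{j:2d}:{ALPHABET[j]}" for j in range(i, i + 8)))
    print(encode_triplet(b"Man"))  # TWFu
```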
How to convert Unicode?
- UTF8 Converter (Unicode to UTF-8 Encoding)
- UTF16 Converter (Unicode to UTF-16 Encoding)
- UTF32 Converter (Unicode to UTF-32 Encoding)
- Base64 Converter (Unicode to Base64 Encoding)
- URL Converter (Unicode to URL Encoding)
- Decimal Converter (Unicode to Decimal converter)
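The conversions listed above can be approximated in a few lines of Python. This is a sketch using only the standard library, not the implementation behind these converters; all function names are hypothetical, and big-endian byte order is an arbitrary choice for the UTF-16 and UTF-32 output.

```python
from urllib.parse import quote, unquote


def to_hex_bytes(text: str, encoding: str) -> str:
    """Return the encoded bytes of `text` as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in text.encode(encoding))


def to_url(text: str) -> str:
    """Percent-encode the UTF-8 bytes of `text` (URL encoding)."""
    return quote(text, safe="")


def to_decimal(text: str) -> list:
    """Return the decimal code point of each character."""
    return [ord(ch) for ch in text]


def from_decimal(codes: list) -> str:
    """Rebuild text from a list of decimal code points."""
    return "".join(chr(c) for c in codes)


if __name__ == "__main__":
    sample = "é"
    print(to_hex_bytes(sample, "utf-8"))      # c3 a9
    print(to_hex_bytes(sample, "utf-16-be"))  # 00 e9
    print(to_hex_bytes(sample, "utf-32-be"))  # 00 00 00 e9
    print(to_url(sample))                     # %C3%A9
    print(to_decimal("Hi"))                   # [72, 105]
    print(unquote("%C3%A9"), from_decimal([72, 105]))
```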