UTF-8
| Standard | Unicode Standard |
|---|---|
| Classification | Unicode Transformation Format, extended ASCII, variable-length encoding |
| Extends | ASCII |
| Transforms / Encodes | ISO/IEC 10646 (Unicode) |
| Preceded by | UTF-1 |
UTF-8 is a character encoding standard used for electronic communication. It is defined by the Unicode Standard, and its name is derived from Unicode Transformation Format – 8-bit.[1] As of July 2025, almost every webpage is transmitted as UTF-8.[2]
UTF-8 supports all 1,112,064[3] valid Unicode code points using a variable-width encoding of one to four one-byte (8-bit) code units.
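The figure of 1,112,064 can be checked with simple arithmetic: it is the size of the code space U+0000..U+10FFFF minus the surrogate range U+D800..U+DFFF, which matches the ranges quoted in reference [3]. The following Python sketch reproduces that count and uses Python's built-in `utf-8` codec to show the one-to-four-byte widths. The names `VALID_CODE_POINTS` and `utf8_length` and the sample code points are illustrative choices, not part of any standard.

```python
# Size of the full Unicode code space, U+0000..U+10FFFF.
TOTAL_CODE_POINTS = 0x10FFFF + 1

# Surrogate code points U+D800..U+DFFF are excluded from UTF-8.
SURROGATES = 0xDFFF - 0xD800 + 1

# Code points that UTF-8 can encode.
VALID_CODE_POINTS = TOTAL_CODE_POINTS - SURROGATES


def utf8_length(code_point: int) -> int:
    """Return how many bytes Python's UTF-8 codec emits for one code point."""
    return len(chr(code_point).encode("utf-8"))


def is_encodable(code_point: int) -> bool:
    """Return True if the code point can be encoded as UTF-8 (strict mode)."""
    try:
        chr(code_point).encode("utf-8")
        return True
    except UnicodeEncodeError:
        # Raised for surrogates such as U+D800.
        return False


print(f"{VALID_CODE_POINTS:,}")
# Sample code points: 'A', 'é', '€', and an emoji.
for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
    print(f"U+{cp:04X} -> {utf8_length(cp)} byte(s)")
```

Running the sketch prints the count followed by widths of 1, 2, 3, and 4 bytes for the four sample code points.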
Code points with lower numerical values, which tend to occur more frequently, are encoded using fewer bytes. UTF-8 was designed for backward compatibility with ASCII: the first 128 characters of Unicode, which correspond one-to-one with ASCII, are encoded as a single byte with the same binary value as in ASCII, so a UTF-8-encoded file that uses only those characters is identical to an ASCII file. Most software designed for any extended ASCII can read and write UTF-8, which results in fewer internationalization issues than any alternative text encoding.[4][5]
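The ASCII compatibility described above can be observed directly. The sketch below, again using Python's built-in codecs, checks that ASCII-only text produces the same bytes under both encodings, and that every byte of a multi-byte sequence has its high bit set, so it cannot be mistaken for an ASCII character. The helper names `is_ascii_identical` and `multibyte_bytes_have_high_bit` are hypothetical and chosen only for this illustration.

```python
def is_ascii_identical(text: str) -> bool:
    """Return True when the UTF-8 bytes of `text` equal its ASCII bytes."""
    try:
        return text.encode("utf-8") == text.encode("ascii")
    except UnicodeEncodeError:
        # The text contains at least one character outside ASCII.
        return False


def multibyte_bytes_have_high_bit(text: str) -> bool:
    """Return True if every byte of each multi-byte sequence is >= 0x80."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1 and any(b < 0x80 for b in encoded):
            return False
    return True


# Each of the first 128 code points encodes to one byte of the same value.
first_128_match = all(
    chr(cp).encode("utf-8") == bytes([cp]) for cp in range(128)
)

print(is_ascii_identical("Hello, world!"))
print(is_ascii_identical("café"))
print(multibyte_bytes_have_high_bit("naïve café costs 5 €"))
print(first_128_match)
```

The second property is one reason software written for an extended ASCII encoding can usually pass UTF-8 through unchanged: delimiters such as `/`, `,`, or newline never appear inside the encoding of another character.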
UTF-8 is the dominant encoding for all countries and languages on the internet, is used in most standards (often as the only allowed encoding), and is supported by all modern operating systems and programming languages.
- ^ Unicode® 6.0.0: Released: 2010 October 11 (Announcement) (6.0.0 ed.). Mountain View, California, US: The Unicode Consortium. ISBN 978-1-936213-01-6. Archived from the original on 2025-07-28. Retrieved 2025-08-23.
- ^ Cite error: The named reference "W3TechsWebEncoding" was invoked but never defined.
- ^ "Conformance". Unicode 16.0.0: Core Spec / Chapter 3 (6.0.0 ed.). Mountain View, California, US: The Unicode Consortium. 3.9 Unicode Encoding Forms. ISBN 978-1-936213-34-4. Archived from the original on 2025-07-01. Retrieved 2025-08-23. "Each encoding form maps the Unicode code points U+0000..U+D7FF and U+E000..U+10FFFF"
- ^ Cite error: The named reference "Microsoft GDK" was invoked but never defined.
- ^ Cite error: The named reference "whatwg" was invoked but never defined.