The new scripts and characters in Version 14.0 add support for modern language groups in Bosnia, India, Indonesia, Iran, Java, Malaysia, Mongolia, Myanmar, Pakistan, and the Philippines, plus other languages in Africa and North America, including:
- Arabic script additions that include honorifics and additions for Quranic use, and characters used to write languages across Africa, the Balkans, and South and Southeast Asia
- The Vithkuqi script historically used to write Albanian and currently undergoing a modern revival
- The Tangsa script used to write the Tangsa language, spoken in India and Myanmar
- The Toto script used to write the Toto language in northeast India
- Many Latin script additions for extended IPA
- 37 emoji characters, including several new emoji for emotion and hand gestures (smileys, hands, animals and nature, food and drink, transport, and activities). For the full list of new emoji characters, see emoji additions for Unicode 14.0, and Emoji Counts. For a detailed description of support for emoji characters by the Unicode Standard, see UTS #51, Unicode Emoji.
- The som currency sign used in the Kyrgyz Republic
- Znamenny musical notation developed in Russia
- Cypro-Minoan, historically used primarily on the island of Cyprus
- Old Uyghur, historically used in Central Asia and elsewhere to write Turkic, Chinese, Mongolian, Tibetan, and Arabic languages
- Ahom, Balinese, Brahmi, Canadian aboriginal languages, Glagolitic, Kaithi, Kannada, Mongolian, Tagalog, Takri, and Telugu
- Arabic support for Hausa, Wolof, Hindko, and Punjabi, and Ethiopic support for Gurage
- Significant updates to the CJK auxiliary blocks and enclosed alphanumerics
Five important Unicode annexes updated for Version 14.0:
- UAX #14, Unicode Linebreaking Algorithm
- UAX #29, Unicode Text Segmentation
- UAX #31, Unicode Identifier and Pattern Syntax
- UAX #38, Unicode Han Database (Unihan)
- UAX #45, U-Source Ideographs
- UTS #10, Unicode Collation Algorithm — sorting Unicode text
- UTS #39, Unicode Security Mechanisms — reducing Unicode spoofing
- UTS #46, Unicode IDNA Compatibility Processing — compatible processing of non-ASCII URLs
Over 144,000 characters are available for adoption
to help the Unicode Consortium’s work on digitally disadvantaged languages