Thursday, February 23, 2023

The Unicode CLDR v43 Alpha is now available for integration testing

CLDR provides key building blocks for software to support the world's languages (dates, times, numbers, sort-order, etc.). For example, all major browsers and all modern mobile phones use CLDR for language support. (See Who uses CLDR?)

Via the online Survey Tool, contributors supply data for their languages — data that is widely used to support much of the world’s software. This data is also a factor in determining which languages are supported on mobile phones and computer operating systems.

The Alpha has already been integrated into the development version of ICU. We would especially appreciate feedback from non-ICU consumers of CLDR data and on Migration issues. Feedback can be filed at CLDR Tickets.

Alpha means that the main data and charts are available for review, but the specification, JSON data, and other components are not yet ready for review. Data may change if release-blocking bugs are found. The planned schedule is:
  • 2023 Mar 15, Wed — public Beta (data)
  • 2023 Mar 29, Wed — public Beta2 (data & spec)
  • 2023 Apr 12, Wed — Release

CLDR 43 is a limited-submission release, focusing on just a few areas:
  1. Formatting Person Names
    • Completing the data for formatting person names, allowing it to advance out of “tech preview”. For more information on the benefits of this feature, see Background.
  2. Adding substantially to the LikelySubtags data
    • This data is used to find the likely writing system (script) and country (region) for a given language, which matters when normalizing locale identifiers and resolving inheritance.
    • The data has been contributed by SIL.
  3. Other data updates
    • Alternate names for Turkey / Türkiye
    • Name for the new timezone Ciudad Juárez
  4. Structure
    • Adding some structure and data needed for ICU4X & JavaScript, for calendar eras and parentLocales.
    • Cleanup of the inheritance structure in CLDR
  5. Collation & Searching
    • Treating various quotation marks, including Geresh and Gershayim, as equivalent at primary strength.
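
To make the LikelySubtags item above more concrete, here is a minimal Python sketch of an "add likely subtags" lookup. The table entries mirror well-known CLDR mappings, but the table itself, the names `parse` and `add_likely_subtags`, and the simplified parsing are assumptions for illustration only; real implementations such as ICU work from the full data set and handle variants and extensions.

```python
# Tiny illustrative table; the real CLDR likelySubtags data has thousands of entries.
LIKELY = {
    "en": "en_Latn_US",
    "sr": "sr_Cyrl_RS",
    "zh": "zh_Hans_CN",
    "zh_TW": "zh_Hant_TW",
    "zh_Hant": "zh_Hant_TW",
    "und": "en_Latn_US",
}


def parse(tag):
    """Split a simple locale ID into (language, script, region); missing parts are ''."""
    language, script, region = "und", "", ""
    parts = tag.replace("-", "_").split("_")
    if parts and parts[0]:
        language = parts[0].lower()
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()          # e.g. "hant" -> "Hant"
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            region = part.upper()          # e.g. "tw" -> "TW", or a numeric code
    return language, script, region


def add_likely_subtags(tag):
    """Fill in a missing script and/or region using the LIKELY table (simplified)."""
    language, script, region = parse(tag)
    # Try the most specific key first, then progressively less specific ones.
    candidates = [
        "_".join(p for p in (language, script, region) if p),
        "_".join(p for p in (language, region) if p),
        "_".join(p for p in (language, script) if p),
        language,
    ]
    for key in candidates:
        if key in LIKELY:
            l2, s2, r2 = parse(LIKELY[key])
            # Subtags given explicitly by the caller always win over the table.
            return "_".join([l2 if language == "und" else language, script or s2, region or r2])
    return tag  # no match: leave the identifier unchanged


print(add_likely_subtags("zh-TW"))
print(add_likely_subtags("en_GB"))
```

In this sketch an explicitly supplied region or script is never overwritten, so `en_GB` picks up only the script from the `en` entry.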

To find out more about these and other changes, see the draft CLDR v43 release page, which has information on accessing the data, reviewing charts of the changes, and, importantly, Migration issues.

Support Unicode
To support Unicode’s mission to ensure everyone can communicate in their languages across all devices, please consider adopting a character, making a gift of stock, or making a donation. As Unicode, Inc. is a US-based open source, open standards, non-profit, 501(c)3 organization, your contribution may be eligible for a tax deduction. Please consult with a tax advisor for details.


Tuesday, February 7, 2023

Unicode 15.1 Alpha Review Opens for Feedback

The repertoire for Unicode 15.1 is now open for early review and comment. As a reminder, during alpha review the repertoire is reasonably mature and stable, but is not yet completely locked down. Discussion regarding whether certain characters should be removed from the repertoire for publication is welcome. Character names and code point assignments are reasonably firm, but suggestions for improvement may still be entertained.

This early review is provided so that reviewers may consider the character repertoire and data file issues prior to the start of beta review (currently scheduled to start in May 2023). Once beta review begins, the repertoire, code points, and character names will all be locked down and will no longer be subject to change.

Notable Changes

Unicode 15.1 adds exactly five characters, for a total of 149,191 characters. The five new characters are Ideographic Description Characters that are used in Ideographic Description Sequences, which represent a mechanism to visually describe the structure of ideographs.
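
As a rough illustration of how an Ideographic Description Sequence describes structure, the following Python sketch parses a sequence written in prefix notation into a nested tree. It covers only the twelve long-standing operators U+2FF0..U+2FFB and deliberately leaves out the five characters new in 15.1; the function name `parse_ids` is hypothetical.

```python
import unicodedata

# Operators in U+2FF0..U+2FFB; two of them take three operands, the rest take two.
OPERATORS = {chr(cp) for cp in range(0x2FF0, 0x2FFC)}
TERNARY = {"\u2FF2", "\u2FF3"}


def parse_ids(text, pos=0):
    """Parse one description sequence starting at text[pos]; return (tree, next_pos)."""
    if pos >= len(text):
        raise ValueError("incomplete sequence")
    ch = text[pos]
    if ch not in OPERATORS:
        return ch, pos + 1  # an ideograph or component is a leaf
    arity = 3 if ch in TERNARY else 2
    operands = []
    pos += 1
    for _ in range(arity):
        node, pos = parse_ids(text, pos)  # operands may themselves be sequences
        operands.append(node)
    return (ch, *operands), pos


# U+2FF0 followed by U+5973 and U+5B50 describes U+597D (woman beside child).
tree, end = parse_ids("\u2FF0\u5973\u5B50")
print(unicodedata.name(tree[0]))
print(tree, end)
```

Because operands can be sequences in their own right, the same function handles nested descriptions without any extra logic.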

In addition, the code charts for the CJK Unified Ideographs, CJK Unified Ideographs Extension A, and CJK Unified Ideographs Extension B blocks now include representative glyphs and source references for nearly 24,000 KP-source ideographs. Furthermore, the format of the code charts for the CJK Unified Ideographs block was updated to accommodate KP-source ideographs through the addition of a seventh column.

Version 15.1 does not add new emoji characters; however, 118 new RGI emoji ZWJ sequences will be defined.
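
For readers unfamiliar with the term, an emoji ZWJ sequence joins existing emoji with U+200D ZERO WIDTH JOINER so that supporting fonts can render them as a single glyph, which is why new sequences can be defined without new characters. The Python sketch below splits a long-established sequence (woman technologist) into its parts. It is illustrative only, uses none of the 118 new sequences, and the helper name `split_zwj` is hypothetical.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER


def split_zwj(sequence):
    """Return the emoji elements that are joined by ZWJ in the sequence."""
    return sequence.split(ZWJ)


# WOMAN (U+1F469) + ZWJ + PERSONAL COMPUTER (U+1F4BB)
woman_technologist = "\U0001F469" + ZWJ + "\U0001F4BB"

parts = split_zwj(woman_technologist)
print(len(woman_technologist))  # number of code points in the sequence
print([f"U+{ord(c):04X}" for part in parts for c in part])
```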

Feedback for the alpha review should be reported under PRI #473 using the Unicode contact form by April 4, 2023.


Monday, February 6, 2023

Announcing New Unicode Adopt-a-Character Site

The Adopt-a-Character (AAC) program was launched in 2015. Since that time, AAC funds have supported Unicode's mission to ensure everyone can communicate in their own language. This includes preserving historical scripts such as Egyptian hieroglyphs and providing better language support for digitally disadvantaged and under-resourced languages such as Hanifi Rohingya, used in Myanmar and Bangladesh.

Now you can more easily adopt a character and show off your hobby or business, favorite sport, or love – while also supporting a good cause. You can also give the gift of a letter to someone in your life. The possibilities are endless – and each adoption helps Unicode’s goal to support the world’s languages.

All character adoptions are permanent. Adoption of a specific character at the limited gold and silver levels is on a first-come-first-served basis. All sponsors receive a digital badge and are recognized on Unicode’s website, Twitter feed, and Friends of Unicode Facebook page.

To start your adoption, visit our new page!

Unicode, Inc. is a non-profit, 501(c)3 organization and contributions may be eligible for a tax deduction. Please consult with a tax expert for details.
