The Unicode CLDR v42 Alpha is now available for integration testing.
CLDR provides key building blocks for software to support the
world's languages (dates, times, numbers, sort-order, etc.). For example, all
major browsers and all modern mobile phones use CLDR for language support. (See
Who uses CLDR?)
Via the online Survey Tool, contributors supply data for their
languages — data that is widely used to support much of the world’s software.
This data is also a factor in determining which languages are supported on
mobile phones and computer operating systems.
The alpha has already been integrated into the development version
of ICU. We would especially appreciate feedback from non-ICU consumers of CLDR
data, as well as feedback on Migration issues. Feedback can be filed at
CLDR Tickets.
Alpha means that the main data and charts are available for review,
but the specification, JSON data, and other components are not yet ready for
review. Some data may change if showstopper bugs are found. The planned schedule
is:
- Sep 14 — Beta (data)
- Sep 28 — Beta2 (spec)
- Oct 19 — Release
In CLDR 42, the focus is on:
- Locale coverage. The following locales now have higher coverage levels:
  - Modern: Igbo (ig), Yoruba (yo)
  - Moderate: Chuvash (cv), Xhosa (xh)
  - Basic: Haryanvi (bgc), Bhojpuri (bho), Rajasthani (raj), Tigrinya (ti)
- Formatting Person Names. Added data and structure for formatting people's names. For more information on why this feature is being added and what it does, see Background.
- Emoji 15.0 Support. Added short names, keywords, and sort-order for the new Unicode 15.0 emoji.
- Coverage, Phase 2. Added additional language names and other items to the Modern coverage level, for more consistency (and utility) across platforms.
- Unicode 15.0 additions. Made the regular additions and changes for a new release of Unicode, including names for new scripts, collation data for Han characters, etc.
There are many other changes. To find out more, see the draft
CLDR v42 release page, which has information on accessing the data, reviewing charts of the changes, and, importantly,
Migration issues.
In version 42, the following levels were reached:
| Level | Languages | Locales* | Notes |
|---|---|---|---|
| Modern | 94 | 366 | Suitable for full UI internationalization |
| | | | Afrikaans, … Čeština, … Dansk, … Eesti, … Filipino, … Gaeilge, … Hrvatski, Indonesia, … Jawa, Kiswahili, Latviešu, … Magyar, …Nederlands, … O‘zbek, Polski, … Română, Slovenčina, … Tiếng Việt, … Ελληνικά, Беларуская, … ᏣᎳᎩ, Ქართული, Հայերեն, עברית, اردو, … አማርኛ, नेपाली, … অসমীয়া, বাংলা, ਪੰਜਾਬੀ, ગુજરાતી, ଓଡ଼ିଆ, தமிழ், తెలుగు, ಕನ್ನಡ, മലയാളം, සිංහල, ไทย, ລາວ, မြန်မာ, ខ្មែរ, 한국어, … 日本語, … |
| Moderate | 7 | 11 | Suitable for full “document content” internationalization, such as formats in a spreadsheet. |
| | | | Binisaya, … Èdè Yorùbá, Føroyskt, Igbo, IsiZulu, Kanhgág, Nheẽgatu, Runasimi, Sardu, Shqip, سنڌي, … |
| Basic | 29 | 43 | Suitable for locale selection, such as choice of language in mobile phone settings. |
| | | | Asturianu, Basa Sunda, Interlingua, Kabuverdianu, Lea Fakatonga, Rumantsch, Te reo Māori, Wolof, Босански (Ћирилица), Татар, Тоҷикӣ, Ўзбекча (Кирил), کٲشُر, कॉशुर (देवनागरी), …, মৈতৈলোন্, ᱥᱟᱱᱛᱟᱲᱤ, 粤语 (简体) |
* Locales are variants for different countries or scripts.
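To illustrate why the locale counts are larger than the language counts, here is a minimal Python sketch that groups locale identifiers by their base language. The sample identifiers, the `split_locale` helper, and the simplified `language[_Script][_REGION]` parsing are all assumptions made for illustration. They are not the actual v42 locale list, and the parsing is not a full implementation of the CLDR locale identifier syntax.

```python
from collections import defaultdict

# Hypothetical sample of CLDR-style locale identifiers (NOT the real v42 list).
SAMPLE_LOCALES = [
    "en_US", "en_GB",             # one language, two regions
    "sr_Cyrl_RS", "sr_Latn_RS",   # one language, two scripts
    "yue_Hans_CN",
    "ig_NG",
]


def split_locale(locale_id):
    """Split 'language[_Script][_REGION]' into (language, script, region).

    Simplified assumption: a 4-letter subtag is a script, and a 2-letter
    or 3-digit subtag is a region. Variants and extensions are ignored.
    """
    parts = locale_id.split("_")
    language = parts[0]
    script = None
    region = None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            region = part
    return language, script, region


def group_by_language(locale_ids):
    """Map each base language code to the locale identifiers that use it."""
    groups = defaultdict(list)
    for locale_id in locale_ids:
        language, _script, _region = split_locale(locale_id)
        groups[language].append(locale_id)
    return dict(groups)


groups = group_by_language(SAMPLE_LOCALES)
print(len(groups), "languages,", len(SAMPLE_LOCALES), "locales")
```

With this sample, six locale identifiers collapse to four base languages, which mirrors on a small scale how a single language can account for several locales in the table.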
Over 144,000 characters are available for adoption
to help the Unicode Consortium’s work on digitally disadvantaged languages.