Education and The Work of the Consortium

The Work of the Consortium

Whether one is a student, educator, scholar, computer scientist, or a member of an organization working with developing countries, one can benefit from and contribute to the work of supporting the world's languages in computer use. The Unicode Standard today supports 93 writing systems (or scripts) in addition to many important symbol sets. A great foundation has been laid, but there is more to do.

The Consortium welcomes participation in its ongoing work, as well as in that of other organizations working to add scripts not yet included in the Unicode Standard.

New Characters and Script Encoding. The Unicode Consortium calls the process of adding new scripts and characters the encoding process. Encoding the world's writing systems is an ongoing project. With approximately 6,000 languages spoken in the world today, there are still many characters and writing systems that are unencoded. The Consortium welcomes proposals from individuals and organizations. To submit a proposal, see "Submitting New Characters or Scripts" on the Unicode website.

In addition to the work of the Consortium itself, two initiatives that also focus on this important work are the Script Encoding Initiative, located at the University of California, Berkeley, and ScriptSource, sponsored by SIL International.

The Unicode Locales Project - CLDR. Unicode CLDR locales provide the key building blocks for software to support the world's languages, with the largest and most extensive standard repository of locale data that is freely available. This data is used by a wide spectrum of companies and organizations to adapt their software to the conventions of different languages for such common software tasks as:

formatting of dates, times, and time zones

formatting numbers and currency values

sorting text

choosing languages or countries by name, in one's native language

This adaptation process is called internationalization and localization.
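To make the idea of locale data concrete, here is a minimal sketch of number formatting driven by a per-locale table. The table, the `NumberSymbols` type, and the `format_number` function are hypothetical and written for illustration only. The two entries reflect the well-known separator conventions for US English and German, copied by hand rather than read from CLDR. Real software would normally obtain these values from CLDR through an internationalization library instead of maintaining its own table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumberSymbols:
    """Separators one locale uses when writing numbers (illustrative type)."""
    decimal: str
    group: str


# Hand-copied, illustrative entries; a real application would load such
# conventions from CLDR data rather than hard-coding them.
SYMBOLS = {
    "en-US": NumberSymbols(decimal=".", group=","),
    "de-DE": NumberSymbols(decimal=",", group="."),
}


def format_number(value: float, locale_id: str, digits: int = 2) -> str:
    """Format value with the grouping and decimal separators of locale_id."""
    symbols = SYMBOLS[locale_id]
    # Start from Python's fixed "," grouping and "." decimal point,
    # then substitute the separators of the requested locale.
    text = f"{value:,.{digits}f}"
    integer_part, _, fraction = text.partition(".")
    integer_part = integer_part.replace(",", symbols.group)
    if fraction:
        return integer_part + symbols.decimal + fraction
    return integer_part


print(format_number(1234567.891, "en-US"))  # 1,234,567.89
print(format_number(1234567.891, "de-DE"))  # 1.234.567,89
```

The same value is rendered differently only because the locale entry differs. The formatting logic itself stays unchanged, which is why a shared repository of locale data is useful.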

The Consortium partners with many individuals, companies, and organizations to collect, validate, and publish new locale data.

UDHR in Unicode Project. The goal of the UDHR in Unicode project is to demonstrate the use of Unicode for a wide variety of languages using the Universal Declaration of Human Rights as the text. The Universal Declaration of Human Rights (UDHR) was selected because it is available in a large number of languages from the Office of the United Nations High Commissioner for Human Rights (OHCHR) at http://www.ohchr.org/EN/UDHR/.

Participation in this project is very welcome, either by reviewing existing translations or by providing new ones. Many thanks to those who have already contributed.

Script Codes. The International Organization for Standardization (ISO) has appointed the Unicode Consortium as the Registration Authority for the international standard ISO 15924, Codes for the representation of names of scripts. The ISO 15924 Registration Authority receives and reviews applications requesting new script codes or changes to existing codes. It maintains a list of information associated with registered script codes, processes updates of registered script codes, and distributes them on a regular basis to subscribers and other parties.

Script codes are important because they are used to identify the scripts used in many different application environments. Libraries use these codes to identify the languages of books and other materials in their collections. Computers use these codes in email and web pages to identify language environments, and some search engines use them as part of their processing.
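One place where script codes appear on the web is in language tags such as "zh-Hant-TW", where the four-letter subtag "Hant" is an ISO 15924 code. The sketch below shows one way to pull that subtag out of a tag. The function name `script_subtag` is hypothetical, and the parsing is deliberately simplified: it assumes a well-formed BCP 47 tag and does not check the result against the registered list of script codes.

```python
from typing import Optional


def script_subtag(language_tag: str) -> Optional[str]:
    """Return the four-letter script subtag of a language tag, if it has one.

    Simplified sketch: assumes a well-formed tag and does not validate the
    result against the ISO 15924 registry.
    """
    subtags = language_tag.replace("_", "-").split("-")
    # A one-letter first subtag marks a private-use or legacy tag,
    # which has no script position.
    if len(subtags[0]) == 1:
        return None
    for subtag in subtags[1:]:
        if len(subtag) == 4 and subtag.isalpha():
            # Script codes are conventionally written in title case.
            return subtag.title()
        if len(subtag) == 3 and subtag.isalpha():
            # Extended language subtag; the script may still follow.
            continue
        # Region, variant, or extension: the script position has passed.
        break
    return None


print(script_subtag("zh-Hant-TW"))  # Hant
print(script_subtag("sr-latn-RS"))  # Latn
print(script_subtag("en-US"))       # None
```

A tag without a script subtag, such as "en-US", yields None. In that case software typically falls back on the script most commonly used for the language.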

New script codes are always being proposed and evaluated. Proposals for additions and changes can be made with the request form.

Unicode Technical Notes. These publications provide information on a variety of topics related to Unicode and internationalization technologies. Technical Notes are independent publications not subject to technical committee review and are not part of the standard.

Existing Technical Notes address many topics of interest, including:

South and South East Asian languages

Rendering combining marks

Collation topics

Character set conversions

For more information on the process for proposing and publishing Technical Notes see About Unicode Technical Notes.