L2/01-169

Title:         Normative Changes in Unicode 3.1

Distribution:  Unicode Liaisons

Date:          April 17, 2001

Source:        Lisa Moore

               Unicode Technical Committee

 

 

The Unicode Standard 3.1 was published on March 30, 2001, and can be found on the Unicode web site at: http://www.unicode.org/unicode/reports/tr27

 

For your convenience, the normative changes in this revision of the Standard are summarized below. We still urge you to review the entire document for any changes relevant to your organization.

 

Normative changes in Unicode 3.1:

 

New character allocations: Unicode 3.1 adds 44,946 encoded characters, including 42,711 new ideographs, many additional symbols, historical scripts, and tag characters.

 

Supplementary characters: Characters are now encoded beyond the original 16-bit codespace, the Basic Multilingual Plane (BMP).
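
As a rough illustration of what encoding beyond the BMP means for UTF-16 data, the sketch below splits a supplementary code point into a surrogate pair. The function name to_surrogate_pair is hypothetical and chosen only for this example; U+20000, a code point in Plane 2, is used as the sample.

```python
def to_surrogate_pair(code_point) -> tuple:
    """Split a supplementary code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000      # 20-bit offset past the BMP
    high = 0xD800 + (offset >> 10)     # top 10 bits -> high (leading) surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits -> low (trailing) surrogate
    return high, low


high, low = to_surrogate_pair(0x20000)
print(f"U+20000 -> {high:04X} {low:04X}")
```

Software that assumed one 16-bit unit per character would need to handle such pairs as a single character.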

 

Amended data files: Data files have been updated to account for the new repertoire of characters.

 

UTF-32: Unicode now has three sanctioned encoding forms: UTF-8, UTF-16, and UTF-32.
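
To make the difference between the three encoding forms concrete, the following sketch prints the code units of one supplementary character in each form. It relies on Python's built-in codecs; the helper name code_units and the choice of U+1D11E as the sample are assumptions made for this example only.

```python
sample = "\U0001D11E"  # a single supplementary code point

# (Python codec name, bytes per code unit); big-endian variants avoid a BOM
forms = [("utf-8", 1), ("utf-16-be", 2), ("utf-32-be", 4)]


def code_units(text, codec, unit_size):
    """Return the encoded text as a list of hex strings, one per code unit."""
    data = text.encode(codec)
    return [data[i:i + unit_size].hex().upper()
            for i in range(0, len(data), unit_size)]


for codec, size in forms:
    print(codec, code_units(sample, codec, size))
```

The same code point takes four 8-bit units in UTF-8, two 16-bit units in UTF-16, and one 32-bit unit in UTF-32.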

 

Noncharacters: Thirty-two more noncharacters have been added and the status of all sixty-six noncharacters has been clarified.
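
The count of sixty-six can be checked with a short sketch. It assumes the thirty-two added noncharacters are the contiguous range U+FDD0..U+FDEF and that the remaining thirty-four are the last two code points of each of the seventeen planes; the function name is_noncharacter is hypothetical.

```python
def is_noncharacter(code_point):
    """Return True if code_point is one of the 66 noncharacters."""
    if 0xFDD0 <= code_point <= 0xFDEF:
        return True  # the contiguous block of 32 in the BMP
    # U+nFFFE and U+nFFFF at the end of every plane (n = 0x0 .. 0x10)
    return 0 <= code_point <= 0x10FFFF and (code_point & 0xFFFE) == 0xFFFE


noncharacters = [cp for cp in range(0x110000) if is_noncharacter(cp)]
print(len(noncharacters))
```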

 

UTF-8 corrigendum: The definition of UTF-8 has been corrected to address a security issue. Conformant implementations can no longer interpret non-shortest forms for BMP characters.
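
The effect of the corrigendum can be observed with any strict UTF-8 decoder. The sketch below uses Python's built-in decoder as a stand-in for a conformant implementation; the helper name decodes_as_utf8 and the byte sequences, which are non-shortest encodings of U+002F SOLIDUS, are chosen for illustration only.

```python
def decodes_as_utf8(data):
    """Return True if the bytes are accepted by a strict UTF-8 decoder."""
    try:
        data.decode("utf-8")  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(b"\x2f"))          # shortest form of U+002F
print(decodes_as_utf8(b"\xc0\xaf"))      # two-byte non-shortest form
print(decodes_as_utf8(b"\xe0\x80\xaf"))  # three-byte non-shortest form
```

Accepting the longer sequences would let a "/" slip past any check that looks only for the byte 0x2F, which is the kind of security issue the corrigendum addresses.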

 

Special character properties: Music format control characters were added to the list of special character properties.

 

UAX #15 Unicode Normalization Forms: U+FB1D HEBREW LETTER YOD WITH HIRIQ has been added to the Composition Exclusion List.
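
A minimal sketch of what the exclusion means in practice, using Python's unicodedata module. That module reflects whichever Unicode version ships with the interpreter rather than 3.1 specifically, so this illustrates the behavior and is not a conformance test.

```python
import unicodedata

precomposed = "\uFB1D"        # HEBREW LETTER YOD WITH HIRIQ
decomposed = "\u05D9\u05B4"   # HEBREW LETTER YOD + HEBREW POINT HIRIQ

# Canonical decomposition recorded for U+FB1D
print(unicodedata.decomposition(precomposed))

# Because U+FB1D is excluded from composition, NFC leaves the
# two-character sequence as it is and also decomposes U+FB1D itself.
nfc_of_precomposed = unicodedata.normalize("NFC", precomposed)
nfc_of_decomposed = unicodedata.normalize("NFC", decomposed)
print(nfc_of_precomposed == decomposed, nfc_of_decomposed == decomposed)
```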

 

New normative properties: All of the General Category values plus the case mappings in UnicodeData.txt and SpecialCasing.txt are now normative; Cn is the default value for the general category for all unassigned code points and noncharacters.
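
These properties can be inspected with a short sketch, again using Python's unicodedata module and built-in case mapping as an assumed stand-in for the data files. Noncharacters serve as the samples for Cn because the set of unassigned code points changes from version to version; the helper name general_category is hypothetical.

```python
import unicodedata


def general_category(code_point):
    """Return the General Category value for a code point."""
    return unicodedata.category(chr(code_point))


# Noncharacters report the default value Cn
for cp in (0xFDD0, 0xFFFF, 0x10FFFF):
    print(f"U+{cp:04X}", general_category(cp))

# A full case mapping of the kind listed in SpecialCasing.txt:
# U+00DF LATIN SMALL LETTER SHARP S uppercases to two characters
print("\u00DF".upper())
```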

 

Normative references: The use of normative references to Unicode properties by other specifications was clarified.