
Corrigendum #2: Yod with Hiriq Normalization

 
Corrigendum: Corrigendum #2: Yod with Hiriq Normalization
Effective Date: 2001-Jan-31 [86-M20]
Applicable Versions: 3.0.0 and 3.0.1
Fixed Version: 3.1.0, 2001-March
Result Documented In: CompositionExclusions.txt

In the production of the normalization tables for Unicode 3.0, the character U+FB1D HEBREW LETTER YOD WITH HIRIQ was mistakenly omitted from Composition Exclusions. During the public review period, this mistake was reported, but the report was misinterpreted and thus overlooked. This corrigendum corrects that omission.

Add the following entry to CompositionExclusions.txt in the Script Specifics section of that data file:

FB1D # HEBREW LETTER YOD WITH HIRIQ
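
As an illustration of how an implementation might consume such an entry, the following minimal Python sketch reads lines in the style shown above. It assumes that each data line holds a single hexadecimal code point and that everything after "#" is a comment. The function name parse_exclusions and the sample lines are hypothetical, not part of the data file or of any library.

```python
def parse_exclusions(lines):
    """Collect code points from CompositionExclusions-style lines.

    Assumption: one hexadecimal code point per data line, with an
    optional '#' comment. Blank and comment-only lines are skipped.
    """
    excluded = set()
    for line in lines:
        # Drop the comment part, then any surrounding whitespace.
        field = line.split("#", 1)[0].strip()
        if field:
            excluded.add(int(field, 16))
    return excluded


# Illustrative excerpt only; this is not the real data file.
sample = [
    "# (hypothetical excerpt)",
    "FB1D # HEBREW LETTER YOD WITH HIRIQ",
    "",
]

excluded = parse_exclusions(sample)
print(sorted(f"U+{cp:04X}" for cp in excluded))
```

A normalizer built on such a set would decline to recompose any sequence whose primary composite is listed in it.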

This corrigendum does affect backwards compatibility of normalization forms NFKC and NFC for strings containing this character. Text containing this character that is in normalization form NFKC or NFC as defined in Unicode 3.0 is no longer in that normalization form after the application of this corrigendum. It is recommended that all implementations of those normalization forms upgrade to the Unicode 3.1 data tables (or later), to ensure interoperability with later versions of the standard.
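
The effect can be observed with any normalizer built on Unicode 3.1 or later data. The sketch below uses Python's standard unicodedata module, which ships with data tables newer than 3.1. It only demonstrates the corrected behavior and does not reproduce the Unicode 3.0 tables.

```python
import unicodedata

YOD_WITH_HIRIQ = "\uFB1D"

# Canonical decomposition: U+05D9 HEBREW LETTER YOD + U+05B4 HEBREW POINT HIRIQ.
decomposed = unicodedata.normalize("NFD", YOD_WITH_HIRIQ)

# With U+FB1D in Composition Exclusions, NFC and NFKC do not recompose
# the pair, so the precomposed character does not survive normalization.
nfc = unicodedata.normalize("NFC", YOD_WITH_HIRIQ)
nfkc = unicodedata.normalize("NFKC", YOD_WITH_HIRIQ)

print([f"U+{ord(c):04X}" for c in decomposed])
print(nfc == YOD_WITH_HIRIQ)  # False under the corrected data
print(nfc == decomposed and nfkc == decomposed)
```

A string that contains U+FB1D therefore fails a check of the form normalize("NFC", s) == s under the corrected tables, which is the compatibility change described above.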

Background

The reasons for issuing this corrigendum are enumerated below.

  • The omission had been reported during the public review period for Unicode 3.0.
  • There were no normative references to Unicode 3.0 Normalization from our liaison organizations (particularly IETF and W3C), although normative references are expected soon after Unicode 3.1.
  • YOD WITH HIRIQ is one of a class of characters ("marked" Hebrew presentation forms within the range U+FB1D .. U+FB4E) that were to be handled all in the same way, during all review and discussion of Normalization in the UTC. The other characters in this class were uniformly included in Composition Exclusions.
  • YOD WITH HIRIQ is a very rare character. The amount of existing data containing it is infinitesimal as a proportion of all computerized text. Even if it takes some time for implementations to upgrade, this change should pose no significant backwards-compatibility issue in practice.
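
The uniform treatment of this class can be spot-checked against current data. The sketch below reads "marked" presentation forms as those code points in U+FB1D .. U+FB4E that have a canonical decomposition. That reading is an assumption made for this example, and the helper name has_canonical_decomposition is hypothetical.

```python
import unicodedata


def has_canonical_decomposition(ch):
    """True if ch has a decomposition mapping without a <compat>-style tag."""
    mapping = unicodedata.decomposition(ch)
    return bool(mapping) and not mapping.startswith("<")


# Candidate "marked" Hebrew presentation forms in U+FB1D .. U+FB4E.
marked_forms = [
    chr(cp)
    for cp in range(0xFB1D, 0xFB4E + 1)
    if has_canonical_decomposition(chr(cp))
]

# Any character that NFC leaves unchanged would be one that is still
# recomposed, that is, one missing from Composition Exclusions.
still_composed = [
    ch for ch in marked_forms if unicodedata.normalize("NFC", ch) == ch
]

print(len(marked_forms), [f"U+{ord(ch):04X}" for ch in still_composed])
```

An empty second list indicates that every character in the class, including U+FB1D, is handled the same way.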
