Updates and Errata
The following is a list of errata noted for The Unicode Standard, Version 5.1, its code charts, annexes and the Unicode Character Database. It is periodically updated to include corrections to typographic errors and new clarifications of the text. This list also includes errata noted for the text of the book, The Unicode Standard, Version 5.0, and not yet corrected in consolidated text.
Formal corrigenda notices for the Unicode Standard can be found at Corrigenda to the Unicode Standard. Corrigenda for the Unicode CLDR are posted at Unicode CLDR Corrigenda, and errata notices for UTS #35: Locale Data Markup Language (LDML) can be found at Errata for UTS #35 LDML.
Reports of errors in published documents, such as the Unicode Standard itself or Unicode Technical Reports, may be filed using the Unicode Consortium's online form. If confirmed, and depending on the nature of the reported error, an erratum may be posted on this page, to be fixed in subsequent editions of the Standard.
| Date |
Summary |
| 2008-May-7 |
In UAX #31, "Unicode Identifier and Pattern Syntax" (Version 5.1.0), there is a typo in the description for (X)ID_Start in Table 2, Lexical Classes for Identifiers. "letter numbers (Lu)" should be corrected to read "letter numbers (Nl)". |
| 2008-April-29 |
In UAX #29, "Unicode Text Segmentation" (Version 5.1.0), there is a typo in the definition of Prepend
in Table 2, Grapheme_Cluster_Break Property Values. The correct definition is: "Logical_Order_Exception=True". |
| 2008-April-28 |
In the Version 5.1 Unicode Character Database, the test cases in the test data
file LineBreakTest.txt incorrectly indicate the presence of a break at the
beginning of each line (with "÷"). These should be corrected to indicate no
break at the beginning of each line (with "×"), to reflect the effect of LB2
"Never break at the start of text" from
UAX #14, "Unicode
Line Breaking Algorithm".
Correspondingly, rule 0.2 in the documentation in LineBreakTest.html should be corrected to read: "sot ×".
|
| 2008-February-12 |
On p. 124 of The Unicode Standard, Version 5.0, there is an error in the Regular Expressions column for
"More_Above", in the third row of
Table 3-14, Context Specification for Casing.
The corrected regular expression should be:
[^\p{ccc=230}\p{ccc=0}]* [\p{ccc=230}]
|
| 2007-June-26 |
The following text from the last paragraph of Section 15.4, Mathematical Symbols, on page 507 of The Unicode Standard, Version 5.0:
"Using U+2278 or U+2279 with VS1 will request these variants explicitly, as will using U+2276 LESS-THAN OR GREATER-THAN or U+2277 GREATER-THAN OR LESS-THAN with U+20D2 COMBINING LONG VERTICAL LINE OVERLAY. Unless fonts are created with the intention to add support for both forms (via VS1 for the upright forms),..."
should be replaced by this text:
"Using U+2276 or U+2277 followed by U+20D2 COMBINING LONG VERTICAL LINE OVERLAY represents these upright variants explicitly. Except for those fonts created with the intention to add support for both forms (via combination of U+2276 or U+2277 with U+20D2 for the upright forms),..." |
| 2007-June-4 |
In Section 12.1, Han on p. 424 of The Unicode Standard, Version 5.0, the last paragraph states that U+FA70 to U+FAD9 are "included in the Unicode Standard to provide full round-trip compatibility with the ideographic repertoire of PKS 5700 parts 1, 2, and 3." However, the Korean standard listed is incorrect, and the text should be corrected to "... the ideographic repertoire of KPS 10721-2000."
|
| 2007-May-24 |
On p. 479 of The Unicode Standard, Version 5.0, the subheading for Linear B Ideographs lists the range as "U+10080–U+108FF". That should be corrected to "U+10080–U+100FF". |
| 2007-January-11 |
There is an error in the entry for "Trailing Consonant" on page
1147 in the glossary of The Unicode Standard, Version 5.0.
"Vowel_Jamo" should be "Trailing_Jamo" in definition (1), thus reading "(1) In Korean, a jamo character with the Hangul_Syllable_Type property value
Trailing_Jamo (in the range U+11A8..U+11F9)." |
| 2007-January-5 |
There is an error in the sample code in Section 5.17 on page 182 of The Unicode Standard, Version 5.0. The entry 0x2F in the second row of the rotate table should instead be 0x1F. |
| 2007-January-4 |
On page 411 of The Unicode Standard, Version 5.0, Table 12-2 incorrectly states the extent of the CJK Unified Ideographs Extension A block. The correct range is U+3400 to U+4DBF. In particular, the Yijing Hexagram
Symbols starting at U+4DC0 are not part of Extension A. |
| 2007-January-2 |
Due to a printing error, the Unified Canadian Aboriginal
Syllabics glyphs at U+1424, U+1426, and U+1487 are missing in the code charts
and names list on pages 684 and 687-88 of The Unicode Standard,
Version 5.0.
These glyphs were correctly represented in the online charts and can be viewed at
http://www.unicode.org/charts/PDF/U1400.pdf. |
| 2007-January-2 |
The file UNIHAN/FullRSIndex.pdf on the Unicode 5.0 CD-ROM is missing a final page with the last half of the entry for 211 (tooth) and the complete entries for 212 (dragon), 213 (turtle), and 214 (flute). The missing page is available here as a PDF. |
| 2006-December-21 |
Table 11-16 in The Unicode Standard, Version 5.0 shows "kyu" twice: once at the top of the part on page 402 and once at the top of the part on page 403. The repetition is an error and the second instance should be removed. |
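The corrected More_Above expression in the 2008-February-12 entry can be read as a scan over canonical combining classes: skip any characters whose class is neither 0 nor 230, then require one character of class 230. The following is a minimal Python sketch of that reading, not code from the Standard. The function name `is_more_above` is hypothetical, and the combining classes come from whatever Unicode version the running Python's `unicodedata` module ships, which may differ from Version 5.0.

```python
import unicodedata


def is_more_above(text: str, index: int) -> bool:
    """Return True if text[index] is followed by a More_Above context."""
    # Pattern being mirrored (applied to the text after the character C):
    #   zero or more characters with ccc not in {0, 230}, then one with ccc=230
    for ch in text[index + 1:]:
        ccc = unicodedata.combining(ch)
        if ccc == 230:
            return True   # found a combining mark of class 230 (Above)
        if ccc == 0:
            return False  # a character of class 0 intervenes
        # any other combining class may intervene, so keep scanning
    return False          # ran out of text without finding class 230


if __name__ == "__main__":
    # Illustrative inputs: i + COMBINING DOT ABOVE, and i + DOT BELOW + DOT ABOVE
    for sample in ("i\u0307", "i\u0323\u0307", "ia\u0307", "i"):
        print([f"U+{ord(c):04X}" for c in sample], is_more_above(sample, 0))
```

A plain loop is used here because Python's built-in `re` module does not support `\p{ccc=...}` property syntax.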
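The 2008-May-7 entry distinguishes two General_Category values that are easy to confuse: Lu (uppercase letter) and Nl (letter number). As a rough illustration, the sketch below prints the category of a few sample characters and whether each one can start a Python identifier. Python's identifier rules are based on XID_Start, so letter numbers are accepted as a first character. The helper name `describe` and the sample characters are choices made for this example, and the results reflect the Unicode version bundled with the running Python rather than Version 5.1 specifically.

```python
import unicodedata


def describe(ch: str) -> tuple[str, bool]:
    """Return (General_Category, can-start-an-identifier) for one character."""
    return unicodedata.category(ch), ch.isidentifier()


if __name__ == "__main__":
    # Sample characters chosen for illustration; they are not taken from UAX #31.
    samples = ["A", "\u2160", "\u3007", "1"]
    for ch in samples:
        category, starts_identifier = describe(ch)
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
              f"gc={category}, starts identifier={starts_identifier}")
```

U+2160 ROMAN NUMERAL ONE and U+3007 IDEOGRAPHIC NUMBER ZERO are the letter numbers in this sample; the decimal digit is included as a contrast, since it may continue an identifier but not start one.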