L2/04-123

Source: Mark Davis
Title: Items from 4.0.1
Date: 2004/03/22

The following items came up during the release of 4.0.1.

1. The editorial committee was unable to make the change to Ethiopic Digits as directed, because that would have broken backwards compatibility for identifiers. When discussing the identifier compatibility issue, the UTC added a new property Other_ID_Start, and in discussion said that if it were ever necessary, an Other_ID_Continue should be added. However, this was not captured in an action, and the editorial committee felt that it would be exceeding its authority to add a new property in this release.

The proposal is to add such a new property, allowing the Ethiopic Digits to be changed in the next appropriate version of the standard.

Note: any change in Nd also affects line break, since the NU class is intended to contain all Nd (plus the two Arabic numeric separators).

2. There is one property that is in the UCD data files, but is neither captured in the documentation nor derivable from other properties. In CaseFolding there is a property marked by the letter T (for Turkic), with the following values:

0049; T; 0131; # LATIN CAPITAL LETTER I
0130; T; 0069; # LATIN CAPITAL LETTER I WITH DOT ABOVE

The proposal is to add a new property Turkic_Case_Folding_Exceptions, with the above values.

3. There was a reported inconsistency between the text of UAX#14 and the data file regarding the linebreak class of CGJ. The text describes it as GL (glue) and gives a rationale for that. The data file assigns CM (Combining marks and attached characters). That difference dates from the time the CGJ was added to the standard. This needs to be resolved one way or the other.

Mark
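
Illustration for item 2: the following is a minimal sketch, not part of the proposal, of how the two T mappings might be layered over default case folding. The table name simply reuses the proposed property name, and the helper `fold` and its `turkic` flag are hypothetical. Python's built-in `str.casefold()` stands in for the default (non-Turkic) folding.

```python
# The two Turkic (T) mappings quoted in item 2, keyed by code point.
TURKIC_CASE_FOLDING_EXCEPTIONS = {
    0x0049: "\u0131",  # LATIN CAPITAL LETTER I -> LATIN SMALL LETTER DOTLESS I
    0x0130: "\u0069",  # LATIN CAPITAL LETTER I WITH DOT ABOVE -> LATIN SMALL LETTER I
}


def fold(text: str, turkic: bool = False) -> str:
    """Case-fold text; with turkic=True the T mappings take precedence.

    Hypothetical helper for illustration only.
    """
    if not turkic:
        # Default folding as provided by the Python runtime.
        return text.casefold()
    # Apply a T mapping where one exists, otherwise fall back to the default.
    return "".join(
        TURKIC_CASE_FOLDING_EXCEPTIONS.get(ord(ch)) or ch.casefold()
        for ch in text
    )


if __name__ == "__main__":
    for word in ("TITLE", "\u0130STANBUL"):
        print(ascii(fold(word)), ascii(fold(word, turkic=True)))
```

With `turkic=True`, U+0049 folds to dotless U+0131 rather than U+0069, which is why these values cannot be derived from the other folding data and need to be recorded somewhere explicitly.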