L2/04-167
Date/Time: Mon Apr 12 19:26:21 EDT 2004
Contact: Asmus Freytag
Report Type: Error Report
Subject: Inconsistency between data files in 4.0.1

In 4.0.1 an inconsistency has been introduced in the tags for certain properties:

1) Compatibility decomposition tags used to be all lowercase. From 4.0.1, these are titlecased in PropertyValueAliases.txt and DerivedCompatibilityTags.txt, while still lowercase in UnicodeData.txt and NamesList.txt.

2) The joining group tags used to be all uppercase. From 4.0.1, these are titlecased in PropertyValueAliases.txt and DerivedJoiningGroup.txt, while still uppercase in ArabicShaping.txt.

While the casing differences do not affect the identity of these tags, the change does affect software. We allow the use of property tags in *implementations* in all casing variations and with spaces, hyphens or underscores inserted, in order to make it possible to create identifiers from them, even in contexts where specific rules or conventions prescribe or prohibit certain styles.

However, that is not the same as changing the casing etc. of these tags in the database files. Doing that requires every piece of software that tries to correlate the various files to implement this form of ad-hoc loose matching.

Having an inconsistent spelling between the data files and the property/value aliases makes it unnecessarily difficult to use simple tools like grep to look up all occurrences of a given tag across the UCD.

It also breaks some other software. I noticed this when my character browser, http://www.unicode.org/unibook, was no longer able to read Arabic shaping data from the data file. There may be other software out there, whether publicly distributed like my viewer or for internal use, that runs into the same issue.

This change also makes creating diffs of the derived files useless unless you happen to use a case-insensitive tool.

This particular casing change should be considered an erratum, and the casing restored to its original values for 4.1.
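
For illustration, the ad-hoc loose matching that correlating software would have to implement might look roughly like the Python sketch below. It is a minimal sketch and not the complete matching rule of the standard: it only ignores case, spaces, hyphens and underscores. The function names (loose_key, loose_equal, index_by_loose_key) are hypothetical, and the sample tag spellings are illustrative values rather than an extract from the data files.

```python
import re

def loose_key(tag: str) -> str:
    """Fold a tag to a comparison key by dropping spaces, hyphens and
    underscores and lowercasing what remains (simplified assumption)."""
    return re.sub(r"[\s_\-]+", "", tag).lower()

def loose_equal(a: str, b: str) -> bool:
    """True if two spellings denote the same tag under the folding above."""
    return loose_key(a) == loose_key(b)

def index_by_loose_key(tags):
    """Map each folded key back to the spelling used in one file."""
    return {loose_key(t): t for t in tags}

# Illustrative spellings: titlecased with underscores, as in an alias file...
alias_spellings = ["Dalath_Rish", "Teh_Marbuta", "No_Joining_Group"]
# ...and uppercase with spaces, as in a shaping data file.
shaping_spellings = ["DALATH RISH", "TEH MARBUTA"]

aliases = index_by_loose_key(alias_spellings)

# Correlate each data-file spelling with its alias-file spelling.
matched = {s: aliases.get(loose_key(s)) for s in shaping_spellings}

for data_spelling, alias_spelling in matched.items():
    print(f"{data_spelling!r} <-> {alias_spelling!r}")
```

With consistent spelling across the files, a plain string comparison (or grep) would have been enough; the folding step is needed only because the spellings diverge.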