L2/10-317

PRI #173 Invariant Tests

An internal file of machine-readable data is used to test Unicode invariants for each release of Unicode. This PRI proposes adding that file to the Unicode Character Database (UCD), making it available for public use. The data documents what is tested before a version of the UCD is released, and can also be used to test implementations where desired. UAX #44 would be augmented with a short section, based on the header of that file, documenting its structure and usage.

We would appreciate any feedback as to whether this file should be part of the UCD.

The file UnicodeInvariantTest.txt would be included in the UCD. The file UnicodeTestResults.html would not be included in the UCD, but is given here for reference. It is an annotated version of the UnicodeInvariantTest.txt file, with tables added to show the results of assignment statements and any test failures, in this case based on beta data for Unicode 6.0.

Many of the invariants are stability constraints from the Unicode Stability Policies. Each of those is marked with "Stability" in the comment that precedes it. Other invariants are property constraints established by other standards, such as the Regex properties alpha, alphanum, etc. Still others are "red flag" invariants, which are used simply to detect when a change in property value might be problematic. Those typically have a set of exceptions (inclusions or exclusions) that are modified for each release.
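
To make the distinction between an unconditional invariant and a "red flag" invariant with exceptions more concrete, here is a minimal Python sketch. It does not reproduce the syntax of UnicodeInvariantTest.txt, which is not shown in this document, and it is not the tooling used for the UCD. The names find_violations, nfc_is_idempotent, and upper_stays_in_latin1 are hypothetical, and both example invariants were chosen purely for illustration. The results depend on the Unicode version bundled with the Python build's unicodedata module.

```python
import unicodedata


def find_violations(predicate, exceptions=frozenset(), code_points=range(0x110000)):
    """Return the code points for which `predicate` fails.

    Code points listed in `exceptions` are skipped, mirroring the idea of a
    per-release exception list attached to a "red flag" invariant.
    (Hypothetical helper, not part of any Unicode tool.)
    """
    violations = []
    for cp in code_points:
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate code points are not characters; skip them
        if cp in exceptions:
            continue
        if not predicate(chr(cp)):
            violations.append(cp)
    return violations


def nfc_is_idempotent(ch):
    """Unconditional invariant: normalizing to NFC twice equals doing it once."""
    once = unicodedata.normalize("NFC", ch)
    return unicodedata.normalize("NFC", once) == once


def upper_stays_in_latin1(ch):
    """Illustrative "red flag" check: uppercasing does not leave U+0000..U+00FF."""
    return all(ord(c) < 0x100 for c in ch.upper())


if __name__ == "__main__":
    # No exceptions expected for the unconditional invariant.
    print(find_violations(nfc_is_idempotent, code_points=range(0x3000)))

    # The red-flag check fails for U+00B5 and U+00FF until they are listed
    # as known exceptions.
    latin1 = range(0x100)
    print([hex(cp) for cp in find_violations(upper_stays_in_latin1, code_points=latin1)])
    print(find_violations(upper_stays_in_latin1, {0xB5, 0xFF}, latin1))
```

In this sketch, U+00B5 MICRO SIGN and U+00FF LATIN SMALL LETTER Y WITH DIAERESIS uppercase to characters outside Latin-1, so they play the role of the exception set. A newly reported violation outside that set is the kind of signal a "red flag" invariant is meant to raise: the change may be acceptable, in which case the exception list is updated, or it may indicate a problem.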