L2/05-211

Source: Mark Davis
Date: August 5, 2005
Subject: Suggestions for UAX #15

Ken had the following suggestions for UAX #15, which came up in discussion of the stability policies. I have appended some email that I sent to someone in the IETF on normalization, which may also be useful material.

-----
From Ken:

Perhaps the stability section of UAX #15 should be elaborated in the next version to make it pedantically clear:

A. What is being guaranteed into the future. (staying normalized, but not necessarily normalizing to the *same* string)

B. What is true back to the applicable version *if* the corrigenda are applied. (normalizing to the *same* string)

C. What is true back to the applicable version if the corrigenda are not applied. (staying normalized, but not necessarily normalizing to the *same* string, and in some [confused] implementations having to normalize twice to get a stable result)

D. How to make use of NormalizationCorrections.txt between any two versions back to the applicable version so as to guarantee normalizing to the *same* string. I.e., guaranteeing the desired output of B in instances where the client deliberately is *not* applying the corrigenda.

-----
From Mark:

... And the very few data changes we made early on were all verified to be ones that can be accounted for by a simple addition to the mapping table in NamePrep, so that you can maintain backward compatibility. Thus we have always maintained the invariants:

1. Any changes are guaranteed not to disturb the stability of previously *normalized* strings. That is, if on system A, normalize(X) = X', then on system B, normalize(X') = X'. (This assumes that the characters are all defined in the versions of Unicode used on both A and B.) Thus once characters are normalized, they stay normalized.

2. All normalization corrections can be implemented -- or avoided -- by the stringprep mapping (Section 3 of [StringPrep]).
That is, suppose that on Unicode 3.2, X normalizes to X', but on Unicode 4.1, X normalizes to X". Because of #1 above:

   A. To simulate Unicode 4.1 action on a 3.2 system, one merely adds a line to the StringPrep mapping: X => X"

   B. To simulate Unicode 3.2 action on a 4.1 system, one merely adds a line to the StringPrep mapping: X => X'

Because all normalization changes are guaranteed to leave X" and X' alone, this works, and involves no architectural changes to StringPrep; only small additions to the mapping tables.
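
As a rough illustration of case B and of invariant 1, here is a minimal Python sketch. It is not part of StringPrep or of UAX #15: the names prep and SIMULATE_OLD are hypothetical, the mapping step is approximated with str.translate, and the single overlay entry is modeled on the U+F951 correction (Corrigendum #3), which should be checked against NormalizationCorrections.txt before being relied on. It assumes the local Python build carries the corrected Unicode data.

```python
import unicodedata

# Hypothetical overlay: code point -> replacement string, applied BEFORE
# normalization, in the spirit of the StringPrep mapping step.
# The entry is modeled on Corrigendum #3 (U+F951); treat it as illustrative.
SIMULATE_OLD = {0xF951: "\u96FB"}  # X => X' (the uncorrected result)


def prep(text, overlay=None, form="NFC"):
    """Map each character through the optional overlay, then normalize."""
    if overlay:
        text = text.translate(overlay)
    return unicodedata.normalize(form, text)


x = "\uF951"
new_result = prep(x)                # X": result under the local Unicode data
old_result = prep(x, SIMULATE_OLD)  # X': simulated uncorrected result

# Invariant 1: both results are already normalized, so normalizing them
# again (with or without the overlay) leaves them unchanged.
for s in (new_result, old_result):
    assert prep(s) == s
    assert prep(s, SIMULATE_OLD) == s
```

The overlay is safe only because of invariant 1: X' and X" are themselves stable under normalization, so the added mapping line cannot be undone by the normalization step that follows it.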