From: John Cowan (cowan@mercury.ccil.org)
Date: Mon Jul 28 2003 - 21:46:09 EDT
Ted Hopp scripsit:
> After reading through some of the archives (some pointers to the relevant
> parts would be helpful, please--something beyond "consult the archives"),
Try the last week or two of the archives.
> if umlaut had been a later addition to
> Unicode, no vowel-umlaut code could be allowed to have a decomposition to
> vowel + umlaut after the umlaut was introduced (else normalization
> idempotence breaks). Conversely, if umlaut, but none of the composed
> vowel-umlaut characters, had been in from the start, when the latter were
> added they would all have to go into the composition exclusions list (else
> normalization idempotence breaks).
Perfectly correct. However, precomposed versions are essentially
compatibility hacks, and we don't ever intend to add any more, still less
add precomposed versions for things that can be expressed in decomposed
form. It's taken a long time for combining marks to be supported, and
now that support is taking off.
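To make the idempotence point concrete, here is a minimal sketch using Python's standard unicodedata module. The helper names (nfc, nfd, is_idempotent) and the choice of sample characters are mine, purely for illustration. It contrasts u-umlaut, which recomposes under NFC, with DEVANAGARI LETTER QA (U+0958), which is on the composition exclusions list and therefore stays decomposed.

```python
import unicodedata

def nfc(s: str) -> str:
    """Canonical decomposition followed by canonical composition."""
    return unicodedata.normalize("NFC", s)

def nfd(s: str) -> str:
    """Canonical decomposition only."""
    return unicodedata.normalize("NFD", s)

def is_idempotent(s: str, form: str = "NFC") -> bool:
    """Check that normalizing twice gives the same result as normalizing once."""
    once = unicodedata.normalize(form, s)
    return unicodedata.normalize(form, once) == once

# A precomposed character that is NOT excluded from composition.
U_UMLAUT = "\u00fc"              # LATIN SMALL LETTER U WITH DIAERESIS
U_PLUS_DIAERESIS = "u\u0308"     # u + COMBINING DIAERESIS

# A precomposed character that IS on the composition exclusions list.
DEVANAGARI_QA = "\u0958"         # DEVANAGARI LETTER QA
KA_PLUS_NUKTA = "\u0915\u093c"   # KA + SIGN NUKTA

# The umlaut case round-trips: NFD splits it, NFC puts it back together.
assert nfd(U_UMLAUT) == U_PLUS_DIAERESIS
assert nfc(U_PLUS_DIAERESIS) == U_UMLAUT

# The excluded case decomposes even under NFC and is never recomposed.
assert nfc(DEVANAGARI_QA) == KA_PLUS_NUKTA
assert nfc(KA_PLUS_NUKTA) == KA_PLUS_NUKTA

# Either way, normalizing already-normalized text changes nothing.
samples = [U_UMLAUT, U_PLUS_DIAERESIS, DEVANAGARI_QA, KA_PLUS_NUKTA]
assert all(is_idempotent(s, form) for s in samples for form in ("NFC", "NFD"))
```

The exclusion is what keeps text that was already in NFC as KA + NUKTA from changing under a later version's rules, which is the situation Ted describes for a precomposed character added after its parts.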
> but the point is, I hope, clear. Normalization will ossify Unicode: it will
> become harder and harder to accept new, clean encodings. This is truly going
> to become the tail that wags the dog.
New clean encodings aren't the problem. If we discovered a hitherto
unknown combining mark, it would be easy to add it, and there would be
no disturbance of stability, any more than when we add newly encoded
scripts. It's the old dirty encodings that need work-arounds, and there
cannot be very many of those.
> My prediction: normalization will eventually force some sort of version
> indicator to be included in all (normalized) Unicode text. (Weak analogy:
> much as DTD references are, either explicitly or implicitly, part of all XML
> documents).
Actually, they aren't. SGML documents, yes, but not XML ones.
-- 
There is / One art       John Cowan <jcowan@reutershealth.com>
No more / No less        http://www.reutershealth.com
To do / All things       http://www.ccil.org/~cowan
With art- / Lessness     -- Piet Hein