From: Ted Hopp (ted@newslate.com)
Date: Mon Jul 28 2003 - 20:47:49 EDT
Okay, Ken. I'm beginning to get it after reading your thoughtful
explanations and after reading through the following two documents (highly
recommended to all following this thread):
http://www.w3.org/TR/WD-charreq
http://www.w3.org/TR/charmod/
After reading through some of the archives (some pointers to the relevant
parts would be helpful, please--something beyond "consult the archives"), it
strikes me that normalization, with its severe stability requirements, will
eventually distort Unicode so badly as to render it nearly unusable.
Consider the thread that starts at
http://www.unicode.org/mail-arch/unicode-ml/Archives-Old/UML020/0651.html
(from 1999, for goodness' sake!): if umlaut had been a later addition to
Unicode, no vowel-umlaut code could be allowed to have a decomposition to
vowel + umlaut after the umlaut was introduced (else normalization
idempotence breaks). Conversely, if umlaut, but none of the composed
vowel-umlaut characters, had been in from the start, when the latter were
added they would all have to go into the composition exclusions list (else
normalization idempotence breaks). Obviously, neither occurred with umlaut,
but the point is, I hope, clear. Normalization will ossify Unicode: it will
become harder and harder to accept new, clean encodings. This is truly going
to become the tail that wags the dog.
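
For anyone who wants to see the two cases above concretely, here is a
minimal sketch using Python's standard unicodedata module. The language, the
helper names (nfc, nfd, is_idempotent) and the sample characters are my own
choices for illustration, not anything taken from the archived thread.
U+00E4 stands in for a vowel-umlaut that recomposes normally, and U+FB1D
(HEBREW LETTER YOD WITH HIRIQ) for a precomposed character that sits on the
composition exclusions list.

```python
import unicodedata


def nfc(s: str) -> str:
    """Normalize to Form C (canonical composition)."""
    return unicodedata.normalize("NFC", s)


def nfd(s: str) -> str:
    """Normalize to Form D (canonical decomposition)."""
    return unicodedata.normalize("NFD", s)


def is_idempotent(s: str, form: str) -> bool:
    """Check that normalizing twice gives the same result as normalizing once."""
    once = unicodedata.normalize(form, s)
    return unicodedata.normalize(form, once) == once


# Case 1: a precomposed vowel-umlaut that round-trips through NFD and NFC.
A_UMLAUT = "\u00E4"        # LATIN SMALL LETTER A WITH DIAERESIS
A_PLUS_UMLAUT = "a\u0308"  # a + COMBINING DIAERESIS

# Case 2: a precomposed character on the composition exclusions list.
# It decomposes, but NFC never puts it back together.
YOD_HIRIQ = "\uFB1D"             # HEBREW LETTER YOD WITH HIRIQ
YOD_PLUS_HIRIQ = "\u05D9\u05B4"  # YOD + POINT HIRIQ

if __name__ == "__main__":
    print(nfd(A_UMLAUT) == A_PLUS_UMLAUT)    # decomposes
    print(nfc(A_PLUS_UMLAUT) == A_UMLAUT)    # and recomposes
    print(nfc(YOD_HIRIQ) == YOD_PLUS_HIRIQ)  # excluded: stays decomposed
    print(all(is_idempotent(s, f)
              for s in (A_UMLAUT, A_PLUS_UMLAUT, YOD_HIRIQ, YOD_PLUS_HIRIQ)
              for f in ("NFC", "NFD")))
```

The second case is the mechanism I mean by "go into the composition
exclusions list": the sequence yod + hiriq stays as it is under NFC, so text
normalized before the precomposed character existed is still normalized
afterwards.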
My prediction: normalization will eventually force some sort of version
indicator to be included in all (normalized) Unicode text. (Weak analogy:
much as DTD references are, either explicitly or implicitly, part of all XML
documents.)
Normalization and its applications (such as early normalization for string
identity matching) may indeed be the show-stopper (today), so this question
may be moot, but I'll ask it anyway: Are there any other uses of combining
classes that would break (in ways apart from normalization breaking) if the
assignments for the Hebrew vowels were changed? We might as well be sure
that we know the entire scope of the issues involved.
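
To make the normalization dependency concrete, here is a small sketch, again
in Python with the standard unicodedata module, of how the current combining
classes drive canonical reordering of Hebrew points. The helper name
combining_classes and the sample sequence (bet, dagesh, sheva) are my own
illustrative choices. It shows only the normalization side of the question;
it says nothing about other uses of the classes.

```python
import unicodedata

BET = "\u05D1"     # HEBREW LETTER BET, combining class 0 (a base letter)
SHEVA = "\u05B0"   # HEBREW POINT SHEVA
DAGESH = "\u05BC"  # HEBREW POINT DAGESH OR MAPIQ


def combining_classes(s: str) -> list[int]:
    """Return the canonical combining class of each character in s."""
    return [unicodedata.combining(ch) for ch in s]


# A sequence with the dagesh stored before the vowel point.
typed = BET + DAGESH + SHEVA

# Canonical reordering sorts each run of combining marks by ascending
# combining class, so the order of the two points changes under NFD.
reordered = unicodedata.normalize("NFD", typed)

if __name__ == "__main__":
    print(combining_classes(typed))
    print(combining_classes(reordered))
    print(reordered == BET + SHEVA + DAGESH)
```

Because sheva and dagesh have different nonzero classes, both input orders
normalize to the same string. Any reassignment of the vowel classes would
change which stored sequences count as already normalized, which is exactly
the stability problem discussed above.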
Ted
Ted Hopp, Ph.D.
ZigZag, Inc.
ted@newSLATE.com
+1-301-990-7453
newSLATE is your personal learning workspace
...on the web at http://www.newSLATE.com/