Re: Still can't work out whats a "canonical decomp" vs a "compatibility decomp"

From: Michael (michka) Kaplan
Date: Thu May 08 2003 - 10:26:08 EDT

    From: <>

    > > I suppose there is always Bytext?
    > Hey, I was trying to be serious...

    So was I, in reverse. Here is the long form of the critical comment I made.

    The simple fact is that any attempt to take a complex system and either
    simplify it or make it more intuitive will only serve to make it more
    complex and harder to understand, in the long run. To wit:

    1) The whole notion of encoding the same thing multiple times exists for the
    sake of making it easier to be compatible with legacy encodings (as long as
    one ignores how confusing it is to have more than one way to say the same
    thing).

    2) The whole notion of decompositions is to make it easier to handle #1 (as
    long as one ignores the obvious complexity and the consequences of a

    3) The whole notion of compatibility vs. canonical decompositions is to try
    to distinguish between two different classes of decompositions that need to
    be treated two different ways (again one has to ignore the increased
    complexity and those mistakes again).

    4) The whole notion of normalization is to give a framework for implementers
    of Unicode to be able to define exactly what the text should be, given all
    of the above (as long as one ignores the time limit this places on getting
    the data in and of course the nightmare of different forms, more of which
    are created by implementers all the time!).

    5) The whole notion of the stability guarantees is to allow implementers to
    trust what Unicode sends out into the void (as long as one ignores the fact
    that it gives teeth to the bite of mistakes in #1-#4).

    6) The whole notion of new properties in the UCD is to try to make certain
    operations easier (as long as one ignores the obvious complexity increase
    and the normativity problems again).

    7) The whole notion of the glossary is to try to define the myriad of terms
    (as long as one ignores the size of this thing -- it may be on track to be
    longer than the book by Unicode 6.0 at the rate we are going! <grin>)

    8) Sometimes it seems like the whole notion of this list (or at least its
    most useful purpose!) is to point out all of the various problems and issues
    that come up due to the imperfect nature of each of the above items -- every
    single one of which served to make the standard more complex, despite the
    goal of trying to make it easier. :-)

    9) Bytext is a good example of a new kind of non-solution -- throw it all
    away and start over with a whole new set of issues and problems, plus
    inherit others for the sake of compatibility. Everyone does have a right to
    be wrong -- we have been often enough, so why begrudge the rest of the

    We may serve everyone better to just doc what it is and work on improving
    *that* and not to keep coming up with yet another property or yet another
    mechanism to make everyone's life easier. Because those kinds of things
    never work... :-)
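
For anyone still stuck on the question in the subject line, the distinction drawn in points 2 through 4 can be poked at directly. What follows is only a rough sketch, assuming Python's standard unicodedata module; the helper name decomposition_kind and the choice of sample characters are made up for illustration and are not part of any Unicode API. It relies on the UCD convention that a compatibility mapping carries a tag such as <compat> or <super>, while a canonical mapping has no tag.

```python
import unicodedata


def decomposition_kind(ch):
    """Classify the decomposition mapping recorded for one character.

    Hypothetical helper for illustration only. It looks at the explicit
    mapping exposed by unicodedata.decomposition() and nothing else.
    """
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return "none"
    # Compatibility mappings start with a tag like <compat> or <super>;
    # canonical mappings are a bare list of code points.
    return "compatibility" if mapping.startswith("<") else "canonical"


# Sample characters chosen for illustration:
#   U+00E9 e with acute, U+212B angstrom sign,
#   U+FB01 fi ligature, U+00B2 superscript two
for ch in ["\u00e9", "\u212b", "\ufb01", "\u00b2"]:
    nfd = unicodedata.normalize("NFD", ch)    # applies canonical mappings only
    nfkd = unicodedata.normalize("NFKD", ch)  # applies compatibility mappings too
    print(
        f"U+{ord(ch):04X}",
        decomposition_kind(ch),
        "NFD:", " ".join(f"{ord(c):04X}" for c in nfd),
        "NFKD:", " ".join(f"{ord(c):04X}" for c in nfkd),
    )
```

The point to notice is that the canonical forms (NFD, NFC) leave the fi ligature and the superscript two alone, while the compatibility forms (NFKD, NFKC) fold them to plain letters and digits. That is the "two different ways" of point 3 in practice.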

