Re: Still can't work out whats a "canonical decomp" vs a "compatibility decomp"

From: Michael (michka) Kaplan (michka@trigeminal.com)
Date: Thu May 08 2003 - 10:26:08 EDT

Next message: William Overington: "Re: Quest text font now available."

Previous message: Marco Cimarosti: "RE: Still can't work out whats a "canonical decomp" vs a "compatibility decomp""
In reply to: jarkko.hietaniemi@nokia.com: "RE: Still can't work out whats a "canonical decomp" vs a "compatibility decomp""

> > I suppose there is always Bytext?
>
> Hey, I was trying to be serious...

So was I, in reverse. Here is the long form of the critical comment I made.
:-)

-------------------------------------------
The simple fact is that any attempt to take a complex system and either
simplify it or make it more intuitive will only serve to make it more
complex and harder to understand, in the long run. To wit:

1) The whole notion of encoding the same thing multiple times exists for the
sake of making it easier to be compatible with legacy encodings (as long as
one ignores how confusing it is to have more than one way to say the same
thing).

2) The whole notion of decompositions is to make it easier to handle #1 (as
long as one ignores the obvious complexity and the consequences of a
mistake).

3) The whole notion of compatibility vs. canonical decompositions is to try
to distinguish between two different classes of decompositions that need to
be treated two different ways (again one has to ignore the increased
complexity and those mistakes again).

4) The whole notion of normalization is to give a framework for implementers
of Unicode to be able to define exactly what the text should be, given all
of the above (as long as one ignores the time limit this places on getting
the data in and of course the nightmare of different forms, more of which
are created by implementers all the time!).

5) The whole notion of the stability guarantees is to allow implementers to
trust what Unicode sends out into the void (as long as one ignores the fact
that it gives teeth to the bite of mistakes in #1-#4).

6) The whole notion of new properties in the UCD is to try to make certain
operations easier (as long as one ignores the obvious complexity increase
and the normativity problems again).

7) The whole notion of the glossary is to try to define the myriad of terms
(as long as one ignores the size of this thing -- it may be on track to be
longer than the book by Unicode 6.0 at the rate we are going! <grin>)

8) Sometimes it seems like the whole notion of this list (or at least its
most useful purpose!) is to point out all of the various problems and issues
that come up due to the imperfect nature of each of the above items -- every
single one of which served to make the standard more complex, despite the
goal of trying to make it easier. :-)

9) Bytext is a good example of a new kind of non-solution -- throw it all
away and start over with a whole new set of issues and problems, plus
inherit others for the sake of compatibility. Everyone does have a right to
be wrong -- we have been often enough, so why begrudge the rest of the
world?
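
For anyone who wants to see the distinction in #3 and #4 concretely, here is
a minimal sketch, assuming Python's standard unicodedata module. The helper
names code_points and decompose are made up for illustration, and the two
sample characters are just convenient examples, not anything from the thread.

```python
import unicodedata

# U+00E9 LATIN SMALL LETTER E WITH ACUTE has a canonical decomposition.
e_acute = "\u00e9"
# U+FB01 LATIN SMALL LIGATURE FI has only a compatibility decomposition.
fi_ligature = "\ufb01"


def code_points(text):
    """Hypothetical helper: list the code points of a string as U+XXXX."""
    return [f"U+{ord(ch):04X}" for ch in text]


def decompose(text):
    """Hypothetical helper: apply both decomposing normalization forms.

    NFD applies canonical decompositions only; NFKD applies canonical
    and compatibility decompositions.
    """
    return {
        form: code_points(unicodedata.normalize(form, text))
        for form in ("NFD", "NFKD")
    }


canonical = decompose(e_acute)
compat = decompose(fi_ligature)

# The raw UCD mapping: compatibility mappings carry a <tag>, canonical
# mappings do not.
print(unicodedata.decomposition(e_acute))      # 0065 0301
print(unicodedata.decomposition(fi_ligature))  # <compat> 0066 0069

print(canonical)  # both forms split the letter into e + combining acute
print(compat)     # only NFKD splits the ligature into f + i
```

The canonical case is "the same thing said two ways", so both forms agree;
the compatibility case loses information (the ligature), which is why only
the K forms touch it.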

We might serve everyone better by just documenting what it is and working on
improving *that*, rather than coming up with yet another property or yet
another mechanism to make everyone's life easier. Because those kinds of
things never work... :-)

MichKa


This archive was generated by hypermail 2.1.5 : Thu May 08 2003 - 11:37:16 EDT