Re: Still can't work out what's a "canonical decomp" vs a "compatibility decomp"

From: John Cowan
Date: Wed May 07 2003 - 15:17:31 EDT

    Kenneth Whistler scripsit:

    > Although perhaps John Cowan might be persuaded to come up with
    > the pocket edition explanation, comparable to his famous
    > list of Unicode conformance requirements:

    Well, since you ask (I already sent a somewhat longer version of this by
    private mail):

    Q: What's the difference between canonical and compatibility decomposition?

    A: Replacing a character by its canonical decomposition, which is either
    one or two characters long, does not destroy information, and makes no
    practical difference for most purposes.

    Replacing a character by its compatibility decomposition, which may be
    of any length, does destroy information, but typically transforms the
    character into better-known characters that may be easier to process.
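As an illustration that is not part of the original message, the distinction can be sketched with Python's standard `unicodedata` module. The sample characters (U+00E9, U+FB01, U+FDFA) and the variable names are choices made for this example only. Note that NFD and NFKD apply the decomposition mappings recursively, so their output can be longer than a single mapping.

```python
import unicodedata

e_acute = "\u00E9"  # LATIN SMALL LETTER E WITH ACUTE
fi_lig = "\uFB01"   # LATIN SMALL LIGATURE FI
salla = "\uFDFA"    # ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM

# Raw decomposition mappings from the Unicode Character Database.
# A canonical mapping has no tag; a compatibility mapping starts with <tag>.
canon_mapping = unicodedata.decomposition(e_acute)  # "0065 0301"
compat_mapping = unicodedata.decomposition(fi_lig)  # "<compat> 0066 0069"

# Canonical decomposition (NFD): e + COMBINING ACUTE ACCENT.
decomposed = unicodedata.normalize("NFD", e_acute)
# For this character, recomposing with NFC gives back the original.
recomposed = unicodedata.normalize("NFC", decomposed)

# Compatibility decomposition (NFKD): the ligature becomes plain "fi".
flattened = unicodedata.normalize("NFKD", fi_lig)
# Recomposing does not restore the ligature: that distinction is gone.
after_nfkc = unicodedata.normalize("NFKC", flattened)

# NFD leaves the ligature alone, since it has no canonical decomposition.
lig_under_nfd = unicodedata.normalize("NFD", fi_lig)

# A compatibility decomposition can be much longer than two characters.
long_expansion = unicodedata.normalize("NFKD", salla)

print(canon_mapping, "|", compat_mapping)
print([f"U+{ord(c):04X}" for c in decomposed])
print(repr(flattened), len(long_expansion))
```

Searching or indexing code often applies NFKC or NFKD so that "ﬁ" matches "fi", while text that must be stored or round-tripped faithfully is usually kept in NFC or NFD.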

    He made the Legislature meet at one-horse       John Cowan
    tank-towns out in the alfalfa belt, so that
    hardly nobody could get there and most of
    the leaders would stay home and let him go
    to work and do things as he pleased.    --Mencken, _Declaration of Independence_
