Re: compatibility characters (in XML context)

From: John Cowan (cowan@mercury.ccil.org)
Date: Fri Nov 14 2003 - 19:52:54 EST

    Kenneth Whistler scripsit:

    > However, there were character encoding standards committees,
    > predating the UTC, which did not understand this principle,
    > and which encoded a character for the Ångstrom sign as a
    > separate symbol. In most cases this would not be a problem,
    > but in at least one East Asian encoding, an Ångstrom sign
    > was encoded separately from {an uppercase Å of the Latin script},
    > resulting in two encodings for what really is the same thing,
    > from a character encoding perspective.

    But IIRC they did so in two separate character encoding standards
    which the UTC for reasons of its own decided to treat as one standard.
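
    For readers who want to see the duplicate encoding concretely, here is
    a minimal sketch using Python's standard unicodedata module. It assumes
    the two code points in question are U+212B ANGSTROM SIGN and U+00C5
    LATIN CAPITAL LETTER A WITH RING ABOVE; the helper name same_after_nfc
    is made up for illustration.

```python
import unicodedata

ANGSTROM_SIGN = "\u212B"  # ANGSTROM SIGN
A_WITH_RING = "\u00C5"    # LATIN CAPITAL LETTER A WITH RING ABOVE


def same_after_nfc(a: str, b: str) -> bool:
    """Return True if the two strings are identical once both are in NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The raw code points differ, so a naive comparison treats them as distinct.
print(ANGSTROM_SIGN == A_WITH_RING)

# After normalization both strings compare equal.
print(same_after_nfc(ANGSTROM_SIGN, A_WITH_RING))

# The decomposition mapping recorded for the sign, as a string of hex code points.
print(unicodedata.decomposition(ANGSTROM_SIGN))
```

    The comparison is done on normalized copies, so neither input string
    is modified.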

    > Note that there are also piles of "compatibility characters" in
    > Unicode which have no decomposition mapping whatsoever,
    > and which thus are completely unimpacted by normalization.

    If someone undertook to prepare a draft list of these, would the
    UTC consider blessing it, in corrected form? It is disconcerting
    that the notion "compatibility character" is so fuzzily defined.
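
    As a rough starting point for such a list, one could at least sort
    characters by whether they carry a decomposition mapping at all. The
    sketch below uses Python's unicodedata module; the function names are
    hypothetical, and the three sample characters are my own picks. In
    particular, using a box-drawing character as the "no mapping" case is
    only an assumption about what might count as a compatibility character,
    which is exactly the fuzzy part.

```python
import unicodedata


def decomposition_kind(ch: str) -> str:
    """Classify a character by the decomposition mapping in the character database."""
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return "none"
    # Compatibility mappings carry a formatting tag such as <compat> or <font>.
    if mapping.startswith("<"):
        return "compatibility"
    return "canonical"


def unchanged_by_normalization(ch: str) -> bool:
    """Return True if none of the four normalization forms alters the character."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return all(unicodedata.normalize(form, ch) == ch for form in forms)


# Sample characters chosen for illustration only.
samples = {
    "\u212B": "ANGSTROM SIGN",
    "\uFB01": "LATIN SMALL LIGATURE FI",
    "\u2500": "BOX DRAWINGS LIGHT HORIZONTAL",
}

for ch, name in samples.items():
    print(name, decomposition_kind(ch), unchanged_by_normalization(ch))
```

    A test like this can only report which characters lack a mapping; it
    cannot say which of those were encoded for compatibility reasons, so
    a human-reviewed list would still be needed.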

    -- 
    John Cowan  jcowan@reutershealth.com  www.reutershealth.com  ccil.org/~cowan
    Dievas dave dantis; Dievas duos duonos          --Lithuanian proverb
    Deus dedit dentes; deus dabit panem             --Latin version thereof
    Deity donated dentition;
      deity'll donate doughnuts                     --English version by Muke Tever
    God gave gums; God'll give granary              --Version by Mat McVeagh
    