Re: Surrogate points

From: Hans Aberg
Date: Sun Jan 30 2005 - 13:14:10 CST


    At 18:24 +0000 2005/01/29, Jon Hanna wrote:

    >Hey, the surrogates aren't even the most illogical part of the model by a
    >long shot, the characters with single character canonical decompositions are
    >much worse.

    The numbers 0xD800-0xDFFF and 0xFFFE-0xFFFF are not associated with
    characters; they are included as placeholders, never to be used, because
    the encoding UTF-16 was not given a proper design. So an unrelated problem,
    the choice of character encoding, is allowed to influence the logical core,
    the character set description.
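
To make the point about reserved numbers concrete, here is a minimal sketch, assuming Python 3.9+ and its standard `unicodedata` module (the helper name `utf16_code_units` is made up for illustration). It shows that UTF-16 represents a code point above 0xFFFF as two code units taken from 0xD800-0xDFFF, which is why that range carries no characters of its own.

```python
import unicodedata


def utf16_code_units(ch: str) -> list[int]:
    """Return the UTF-16 code units of a string, in order (hypothetical helper)."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# U+1D11E lies above 0xFFFF, so UTF-16 needs a surrogate pair for it.
units = utf16_code_units("\U0001D11E")

# Both code units fall inside the reserved range 0xD800-0xDFFF.
all_in_surrogate_range = all(0xD800 <= u <= 0xDFFF for u in units)

# The character database marks the reserved numbers as non-characters:
# "Cs" is the surrogate category, "Cn" is "not assigned".
surrogate_category = unicodedata.category("\ud800")
ffff_category = unicodedata.category("\uffff")

# A lone surrogate is not a character, so a strict encoder rejects it.
try:
    "\ud800".encode("utf-8")
    lone_surrogate_encodable = True
except UnicodeEncodeError:
    lone_surrogate_encodable = False
```

The last check illustrates the influence running in the other direction as well: other encodings have to refuse these numbers only because UTF-16 uses them internally.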

    The other problem you mention is clearly a problem of describing character
    properties. So, no matter how complicated, it belongs to the character set
    description. Mathematically, though, one just defines an equivalence
    relation on the set of character sequences, with a preferred representative
    chosen for each equivalence class.
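
The equivalence-relation view can be sketched as follows, assuming Python's standard `unicodedata` module. The function names `canonically_equivalent` and `representative` are made up for illustration, and taking NFC as the preferred representative is one possible choice (NFD would serve equally well). Two sequences are treated as equivalent when their canonical decompositions agree.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """True if a and b are in the same canonical equivalence class."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def representative(s: str) -> str:
    """Pick the preferred class representative (here: the NFC form)."""
    return unicodedata.normalize("NFC", s)


# U+212B ANGSTROM SIGN has a single-character canonical decomposition
# to U+00C5, the kind of character mentioned in the quoted message.
angstrom_sign = "\u212B"
a_with_ring = "\u00C5"
a_plus_combining_ring = "A\u030A"

same_class = (
    canonically_equivalent(angstrom_sign, a_with_ring)
    and canonically_equivalent(a_with_ring, a_plus_combining_ring)
)

# All three members of the class map to the same representative.
reps = {representative(s) for s in (angstrom_sign, a_with_ring, a_plus_combining_ring)}
```

Because every member of a class maps to the same representative, comparing representatives is the same as testing equivalence.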

      Hans Aberg
