Re: Nicest UTF

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Dec 10 2004 - 16:37:42 CST


    ----- Original Message -----
    From: "Marcin 'Qrczak' Kowalczyk" <qrczak@knm.org.pl>
    To: <unicode@unicode.org>
    Sent: Friday, December 10, 2004 8:35 PM
    Subject: Re: Nicest UTF

    > "Philippe Verdy" <verdy_p@wanadoo.fr> writes:
    >
    >> The XML/HTML core syntax is defined with fixed behavior of some
    >> individual characters like '&', '<', quotation marks, and with special
    >> behavior for spaces.
    >
    > The point is: what "characters" mean in this sentence. Code points?
    > Combining character sequences? Something else?

    See the XML character model document... XML ignores combining sequences. But
    for Unicode and for XML, a character is an abstract character with a single
    code point allocated in a *finite* repertoire. The repertoire of all possible
    combining character sequences is already infinite in Unicode, as is
    the number of "default grapheme clusters" they can represent.
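
    To make the distinction concrete, here is a rough Python sketch. It is
    only an illustration: the helper name combining_sequences is made up for
    this example, and treating "nonzero canonical combining class" as the test
    for a combining mark is a simplifying assumption (real default grapheme
    cluster segmentation has more rules than this).

```python
import sys
import unicodedata


def combining_sequences(text):
    """Group each base code point with the combining marks that follow it.

    Approximation: a code point counts as a combining mark here only if its
    canonical combining class is nonzero. Some marks have class 0, so this
    is not full grapheme cluster segmentation.
    """
    sequences = []
    for ch in text:
        if sequences and unicodedata.combining(ch) != 0:
            sequences[-1] += ch      # attach the mark to the preceding base
        else:
            sequences.append(ch)     # start a new sequence
    return sequences


# The code point repertoire is finite: Python exposes its upper bound.
print(hex(sys.maxunicode))

# 'e' + COMBINING ACUTE ACCENT + COMBINING DOT BELOW, then 'x':
# four code points, but only two combining sequences.
sample = "e\u0301\u0323x"
print([f"U+{ord(ch):04X}" for ch in sample])
print(len(combining_sequences(sample)))

# Nothing limits how many marks may follow a base, so the set of possible
# sequences is unbounded: each n below yields a distinct single sequence.
for n in (1, 10, 100):
    stacked = "a" + "\u0301" * n
    print(n, len(stacked), len(combining_sequences(stacked)))
```

    The last loop is the point of the argument: the number of code points is
    bounded, while the number of distinct sequences built from them is not.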


