Re: Named character sequences and canonical equivalence, was: Cyrillic - accented/acuted vowels

From: Kenneth Whistler (kenw@sybase.com)
Date: Fri May 06 2005 - 21:20:09 CDT


    > >No. The specification should be clear.
    > >
    > >A Unicode Named Character Sequence is a specific sequence of
    > >code points associated with a name.
    > >
    > Does it have to be a sequence which is stable under all kinds of
    > canonical transformation?

    No, it has to be what it says it has to be: a sequence.

    > Or just under normalisation? Can it ever be a
    > sequence of a base character and a combining character (of combining
    > class greater than 1)? If it can, then there is always the possibility
    > that a combining character of lower combining class is also combined
    > with the same base character, which means that the sequence is not
    > stable under normalisation. But several of the examples given in UAX #34
    > are such sequences, which are not stable under normalisation. This is
    > the issue which Philippe was trying to address, as I understood it.

    No. Yes. So what. So what. So what. Respectively.
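
For readers who want to see the effect being asked about, here is a minimal sketch using Python's standard `unicodedata` module. The sequence used (a + COMBINING ACUTE ACCENT) is an illustrative assumption, not an entry taken from UAX #34, and the helper `code_points` is a made-up name. It shows that adding a mark of lower combining class causes normalization to reorder the marks, so the original code point sequence no longer appears contiguously.

```python
import unicodedata

# Illustrative sequence only (an assumption, not taken from UAX #34):
# LATIN SMALL LETTER A + COMBINING ACUTE ACCENT (combining class 230).
SEQ = "a\u0301"
OGONEK = "\u0328"  # COMBINING OGONEK, combining class 202


def code_points(s):
    """Format a string as space-separated hexadecimal code points."""
    return " ".join(f"{ord(c):04X}" for c in s)


# Append a mark whose combining class is lower than that of the acute.
text = SEQ + OGONEK
nfd = unicodedata.normalize("NFD", text)
nfc = unicodedata.normalize("NFC", text)

# Combining classes of the two marks.
print(unicodedata.combining("\u0301"), unicodedata.combining(OGONEK))

# For each form: its code points, and whether SEQ still occurs in it.
print("raw:", code_points(text), SEQ in text)
print("NFD:", code_points(nfd), SEQ in nfd)
print("NFC:", code_points(nfc), SEQ in nfc)
```

Canonical reordering moves the ogonek ahead of the acute, and NFC then composes the base with the ogonek. None of this changes what the sequence itself is: a list of code points.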

    Sorry to be glib here, but there is no reason for you and
    Philippe to take a simple thing that is what it says it is --
    a Unicode Named Character Sequence -- and start rerunning all
    the nightmare scenarios on it yet one more time.

    A Unicode Named Character Sequence is not some Platonic abstraction
    that needs to have some semantic identity associated with it under
    all conceivable contortions with format characters and combining
    marks in its vicinity.

    It is, quite simply:

    A Unicode code point sequence.

    And...

    A name.
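
As a rough sketch of how little that definition asks for, the pairing can be modelled as a record with two fields. Everything below is hypothetical: the class `NamedSequence`, the `REGISTRY` list, the function `name_of`, and the entry name are all invented for illustration and do not come from the Unicode data files.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class NamedSequence:
    """A name paired with an exact code point sequence, nothing more."""
    name: str
    code_points: Tuple[int, ...]


# Hypothetical entry; the name is made up for this sketch.
REGISTRY = [
    NamedSequence("EXAMPLE SMALL LETTER A WITH ACUTE", (0x0061, 0x0301)),
]


def name_of(code_points: Iterable[int]) -> Optional[str]:
    """Return the name for an exact code point match, else None."""
    wanted = tuple(code_points)
    for entry in REGISTRY:
        if entry.code_points == wanted:
            return entry.name
    return None


print(name_of([0x0061, 0x0301]))          # exact match
print(name_of([0x0061, 0x0328, 0x0301]))  # different sequence, no match
```

The lookup deliberately does no normalization and no matching inside longer strings; a process that wants either would have to add it on top.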

    --Ken


