Re: Variation selectors and vowel marks

From: Asmus Freytag (
Date: Sat Apr 24 2004 - 19:30:14 EDT


    At 04:00 PM 4/24/2004, Peter Kirk wrote:
    >>There are tons of problems once one adds in other combining marks
    >>being applied to the character as well, because then under normalization,
    >>unless the mark you were applying the variation selector to is of
    >>combining class 0, you can't assure that the variation selector will
    >>stay with the mark. Having the existing Variation Selectors behave
    >>in that way would break the normalization stability guarantee, ...
    >This is untrue. Normalisation stability does not apply when the text is
    >changed, and inserting a variation selector is a change to the text. I
    >have never suggested changing the combining class or other normalisation
    >properties of existing VSs. The way to ensure that a VS stays with the
    >mark it applies to is to ensure that in the part of the combining
    >character sequence before the VS all combining characters are already in
    >canonical order. Well, I can see that there are potential problems where
    >there are canonical decompositions (which are not composition exclusions),
    >but that does not apply to the cases I am interested in.

    Because of normalization stability, the combining class of all existing
    variation selectors must remain at 0. A character of class 0 interrupts
    canonical reordering (so that, for example, accent marks inside and outside
    an enclosing mark don't switch places).
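
    The blocking effect of a class 0 character can be checked with a small
    sketch like the one below. It assumes Python's standard unicodedata
    module; the choice of marks (acute, dot below) is only illustrative, and
    placing VS1 after a combining mark is the hypothetical usage discussed in
    this thread, not a defined variation sequence.

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, combining class 230
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, combining class 220
VS1 = "\uFE00"        # VARIATION SELECTOR-1, combining class 0

# Two adjacent marks with classes 230, 220 are out of canonical order,
# so normalization swaps them.
plain = "a" + ACUTE + DOT_BELOW
plain_nfd = unicodedata.normalize("NFD", plain)

# A class 0 character between the marks splits them into separate runs;
# canonical reordering never moves a mark across it.
with_vs = "a" + ACUTE + VS1 + DOT_BELOW
with_vs_nfd = unicodedata.normalize("NFD", with_vs)

print([hex(ord(c)) for c in plain_nfd])
print([hex(ord(c)) for c in with_vs_nfd])
```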

    Unnormalized data is perfectly legal in Unicode and *must* be treated as
    just as equivalent to normalized data as the composed and decomposed
    normalized forms are to each other. [The rules for that are in chapter 3.]
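
    A rough illustration of why this matters for a variation selector placed
    after a combining mark: in the sketch below (Python's unicodedata again;
    the helper mark_before_vs and the sample sequence are made up for
    illustration), a legal but unnormalized sequence has VS1 following the
    dot below, and after normalization VS1 follows the acute instead.

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, combining class 230
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, combining class 220
VS1 = "\uFE00"        # VARIATION SELECTOR-1, combining class 0

def mark_before_vs(text):
    """Return the character immediately preceding the first VS1, or None."""
    i = text.find(VS1)
    return text[i - 1] if i > 0 else None

# Legal, unnormalized input: the writer intends VS1 to modify the dot below.
unnormalized = "a" + ACUTE + DOT_BELOW + VS1

# Canonical reordering sorts the two marks (220 before 230) but leaves the
# class 0 selector where it is, so it ends up after the other mark.
normalized = unicodedata.normalize("NFD", unnormalized)

print(hex(ord(mark_before_vs(unnormalized))))
print(hex(ord(mark_before_vs(normalized))))
```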

    Therefore, any scheme that only works if data is always normalized is not

    You can dream of new types of characters, which have different combining
    classes, but then, by your own admission in another part of this thread,
    you would be forced to add new characters.

    We have purposefully added a large number of variation selectors so that
    software can be built today that robustly covers all those processes where
    the variation selectors can be ignored. As I pointed out in my last
    message, it's a defining characteristic of variation selectors that there
    are many processes for which they should be ignored.
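
    As a minimal sketch of one such process, a loose comparison can simply
    drop variation selectors before comparing. The function names below are
    hypothetical, and the ranges cover only the blocks U+FE00..U+FE0F and
    U+E0100..U+E01EF; a real implementation would also need to decide how to
    treat the Mongolian free variation selectors.

```python
def is_variation_selector(ch):
    """True for code points in the two Variation Selectors blocks."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF

def strip_variation_selectors(text):
    """Remove every variation selector, leaving all other characters."""
    return "".join(ch for ch in text if not is_variation_selector(ch))

def loosely_equal(a, b):
    """Compare two strings while ignoring variation selectors."""
    return strip_variation_selectors(a) == strip_variation_selectors(b)

# A base character followed by VS1 matches the bare base character.
print(loosely_equal("\u2260\uFE00", "\u2260"))
```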

    Because of that, it would be *much* easier even to add 6 1/2 dozen new
    combining characters than a single 'specialized' new type of variation

