Re: Variation selectors and vowel marks

From: Ernest Cline
Date: Sat Apr 24 2004 - 18:16:45 EDT

    > [Original Message]
    > From: Peter Kirk <>
    > On 24/04/2004 11:22, Ernest Cline wrote:
    > >...
    > >
    > >
    > >As someone who has put a lot of thought into variation selectors, let me
    > >point out something. In the case of B M1 M2 VS what would the variation
    > >selector indicating as being varied if such a thing were to be allowed?
    > >Since variation selectors are combining marks, then just like any other
    > >combining marks they should be viewed as being applied to the entire
    > >combining sequence up to that point, and hence should be viewed as
    > >indicating a variant of B M1 M2, and not of just the preceding mark.
    > >Any other treatment complicates things too much.
    > I always assumed that VS's are intended to apply to just the immediately
    > preceding character, and not to a whole combining character sequence. In
    > my opinion, "Any other treatment complicates things too much." But
    > perhaps there are others who can tell us what the UTC intended for this.

    Which is why, as things currently stand, the standard restricts
    variation sequences to base characters only. To quote from Section 15.6:

    "The base character in a variation sequence is never a combining
    character or a decomposable character. The variation selectors
    themselves are combining marks of combining class 0 ..."

    For Variation Selectors to apply to other combining marks at all,
    one would need to change the way Variation Selectors work, and
    doing that is what would complicate things too much.
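
    The property the quoted passage relies on can be checked directly.
    The following is a minimal sketch, assuming Python's standard
    unicodedata module; the variable names are only illustrative. It
    reads the canonical combining class of VS1 through VS16 and, for
    comparison, of the Hebrew point qamats.

```python
import unicodedata

# VS1..VS16 occupy U+FE00..U+FE0F.
selectors = [chr(cp) for cp in range(0xFE00, 0xFE10)]

# Collect the distinct canonical combining classes of the selectors.
selector_classes = {unicodedata.combining(ch) for ch in selectors}

QAMATS = "\u05b8"  # HEBREW POINT QAMATS
qamats_class = unicodedata.combining(QAMATS)

print("selector classes:", selector_classes)
print("qamats class:", qamats_class)
```

    The selectors all report class 0, while qamats reports a non-zero
    class, so the two are treated differently by canonical ordering.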

    > >Thus in the case of the vowel marks, one could add a series of variation
    > >sequences with one for each base character that the variant vowel
    > >mark would be used with. If this causes too many other problems, ...
    > It would indeed if someone considers that every such combining sequence
    > has to be enumerated and defined individually. But if one simply says
    > that every combining sequence containing e.g. the sequence <QAMATS, VS1>
    > is legal and represents use of the variant qamats glyph, then there is
    > no problem.

    There are plenty of problems once other combining marks are applied
    to the same base character as well. Under normalization, unless the
    mark you are applying the variation selector to is of combining
    class 0, you can't ensure that the variation selector will stay
    with that mark, because canonical reordering may move a different
    mark in front of the selector. Having the existing Variation
    Selectors behave in that way would break the normalization
    stability guarantee, so that can't be done. You would need to
    introduce new Variation Selectors that behave in this novel fashion.
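
    To make the reordering concrete, here is a small sketch, again
    assuming Python's unicodedata module. The sequence <QAMATS, VS1> is
    not a defined variation sequence; it is used only as the
    hypothetical case under discussion. A dagesh (class 21) is typed
    before the qamats (class 18), and VS1 is typed right after the
    qamats it is meant to vary.

```python
import unicodedata

BET = "\u05d1"     # HEBREW LETTER BET, class 0
DAGESH = "\u05bc"  # HEBREW POINT DAGESH OR MAPIQ, class 21
QAMATS = "\u05b8"  # HEBREW POINT QAMATS, class 18
VS1 = "\ufe00"     # VARIATION SELECTOR-1, class 0

# Hypothetical input: the selector directly follows the qamats.
typed = BET + DAGESH + QAMATS + VS1

# Canonical ordering sorts the run of non-zero-class marks (dagesh,
# qamats) by class; the class 0 selector stays where it is.
normalized = unicodedata.normalize("NFD", typed)

def names(text):
    """Return the character names of a string, for readable output."""
    return [unicodedata.name(ch) for ch in text]

print(names(typed))
print(names(normalized))

# The mark immediately before the selector after normalization.
mark_before_selector = normalized[normalized.index(VS1) - 1]
```

    After normalization the qamats has moved in front of the dagesh,
    so the selector now directly follows the dagesh rather than the
    qamats it was typed after.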

    To do so under the existing combining class framework, you would
    need to add variation selectors with the same combining class as
    the marks they work with. An alternative would be to add yet another
    property for these new Variation Selectors so that they fall outside
    the existing canonical combining class rules when it comes to
    canonical ordering. Either way, it won't work properly with existing
    implementations, involves a lot more work than adding another
    vowel mark, and will not solve the problem of legacy data using the
    vowel mark for both the main version and its variant. I just don't
    see the benefits justifying the costs. If there were a number of use
    cases for doing this, it might justify the effort required, but for
    only a couple of vowel marks, I can't see it.
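
    The first option can be sketched with a toy model. Everything here
    is an assumption for illustration: the private-use code point
    U+E000 stands in for a hypothetical new selector, canonical_order
    is a simplified reimplementation of the canonical ordering step
    only, and the overrides table is not a real Unicode property.

```python
import unicodedata

HYPOTHETICAL_VS = "\ue000"  # private-use placeholder, not a real selector

def ccc(ch, overrides):
    """Combining class, with hypothetical overrides applied first."""
    return overrides.get(ch, unicodedata.combining(ch))

def canonical_order(text, overrides):
    """Stable-sort each run of non-zero-class marks by combining class."""
    out, run = [], []
    for ch in text:
        if ccc(ch, overrides) == 0:
            # A class 0 character ends the current run of marks.
            out.extend(sorted(run, key=lambda c: ccc(c, overrides)))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=lambda c: ccc(c, overrides)))
    return "".join(out)

BET, DAGESH, QAMATS = "\u05d1", "\u05bc", "\u05b8"
typed = BET + DAGESH + QAMATS + HYPOTHETICAL_VS

# Selector given the same class as qamats: it travels with the qamats.
same_class = canonical_order(
    typed, {HYPOTHETICAL_VS: unicodedata.combining(QAMATS)}
)

# Selector left at class 0, like the existing ones: it stays put.
class_zero = canonical_order(typed, {})
```

    In this model the same-class selector stays adjacent to the qamats
    because the sort is stable, but such a selector would only pair
    with marks of that one class, and an implementation that does not
    know its class would still order it as in the class 0 case.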

    This archive was generated by hypermail 2.1.5 : Sat Apr 24 2004 - 18:48:44 EDT