Re: Variation selectors and vowel marks

From: Ernest Cline (
Date: Sat Apr 24 2004 - 14:22:33 EDT

    From: Peter Kirk <>
    > On 24/04/2004 06:30, Peter Constable wrote:
    > >
    > >If data is always encoded in canonical order, then having a VS within
    > >the combining mark sequence wouldn't create any normalization problems,
    > >that's true. But you well know that people do not want their Hebrew data
    > >in canonical order. Even if they did, it couldn't be guaranteed.
    > Yes, canonical ordering cannot be guaranteed. But ordering rules can be
    > specified, and departures from them treated as spelling errors.
    > >There's a problem not only in cases of the form B M1 M2 VS, but also in
    > >cases of the form B M1 VS M2. Of course, the issues are different. The
    > >first may normalize to B M2 M1 VS; the second perhaps *ought* to
    > >normalize to B M2 M1 VS, but that won't happen.
    > Well, perhaps the best thing here is to specify that the mark to which
    > the VS applies should always come first after the base character and
    > followed by the VS, irrespective of the normal canonical order. At least
    > that would be unambiguous, and stable under normalisation (since the
    > only relevant precomposed characters are composition exceptions).
    > Other orderings should simply be considered spelling errors.

    As someone who has put a lot of thought into variation selectors, let me
    point out something. In the case of B M1 M2 VS, what would the variation
    selector indicate as being varied, if such a thing were to be allowed?
    Since variation selectors are combining marks, just like any other
    combining marks they should be viewed as being applied to the entire
    combining sequence up to that point, and hence should be viewed as
    indicating a variant of B M1 M2, and not of just the preceding mark.
    Any other treatment complicates things too much.
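
The normalization behaviour discussed above can be checked with a short Python sketch. The choice of LAMED as the base B, PATAH as M1, HIRIQ as M2 and VARIATION SELECTOR-1 as the VS is an assumption made purely for illustration; no such variation sequence is actually defined. It relies only on the standard unicodedata module.

```python
import unicodedata

# Illustrative characters only (an assumption, not a defined variation sequence).
LAMED = "\u05DC"  # base letter B
PATAH = "\u05B7"  # mark M1, canonical combining class 17
HIRIQ = "\u05B4"  # mark M2, canonical combining class 14
VS1 = "\uFE00"    # VARIATION SELECTOR-1, canonical combining class 0


def nfd_names(text):
    """Return the character names of text after NFD normalization."""
    return [unicodedata.name(ch) for ch in unicodedata.normalize("NFD", text)]


# B M1 M2 VS: the two marks are reordered by combining class,
# and the VS (class 0) stays at the end.
trailing = LAMED + PATAH + HIRIQ + VS1

# B M1 VS M2: the VS has class 0, so it blocks reordering of M1 and M2.
medial = LAMED + PATAH + VS1 + HIRIQ

for label, text in (("B M1 M2 VS", trailing), ("B M1 VS M2", medial)):
    print(label, "->", nfd_names(text))
```

The first sequence comes out as B M2 M1 VS, while the second is left unchanged, so the two spellings never converge under normalization.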

    Thus in the case of the vowel marks, one could add a series of variation
    sequences, one for each base character with which the variant vowel
    mark would be used. If this causes too many other problems,
    then adding a new mark for the vowel variant instead of trying to adapt
    variation selectors to the task would seem to be the best solution.
