Re: Variation selectors and vowel marks

From: Asmus Freytag (
Date: Sat Apr 24 2004 - 04:00:11 EDT

    At 08:59 AM 4/23/2004, wrote:
    >I'm surfacing an issue from because it may have
    >wider applicability.
    >Currently, it's the rule that variation selector characters can't be
    >applied to combining characters. This is sensible in the case of true
    >diacritical marks: if two marks differ in shape, they ought in general
    >to be encoded separately, since marks are primarily shape-based rather
    >than functional in the first place.
    >It's not so clear, however, that this rule is appropriate when applied
    >to vowel signs. If some textual traditions represent vowels a and a'
    >differently whereas others unify them, and if the variation is not
    >predictable at the Unicode level, then it would seem appropriate to
    >provide a vowel sign for a and define a VSS <a, VS1> to represent a'
    >in those textual traditions in which it is in use. The alternative of
    >providing a distinct vowel sign a' and treating the difference as one of
    >spelling impacts backward compatibility and burdens textual processes.

    The rationale for making a variation selector ineligible to apply to
    combining marks comes from normalization. In normalization all combining
    character sequences are put in their canonical order, based on their
    combining class. If we want to allow accents to be placed on base
    characters to which a variation selector has been applied, then the
    variation selector needs to come in between the base character and the
    accent. Unless we make it a combining character, the VS would interrupt the
    combining character sequence (separating the accent from the base
    character). But if it is a combining character, it takes part in the
    reordering. Therefore, it needs to have combining class 0, so it stays with
    the base character. And that's the reason it cannot be used to apply to
    characters whose combining class is not 0.
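
    As a rough illustration of the reordering argument, here is a small
    Python sketch using the standard unicodedata module. The choice of
    U+0301 (class 230) and U+0323 (class 220) as sample marks, and the
    helper names ccc and nfd_codepoints, are assumptions made for this
    example only; applying VS1 to a combining mark is not a defined
    sequence and is shown purely to demonstrate what normalization would
    do to it.

```python
import unicodedata

VS1 = "\uFE00"        # VARIATION SELECTOR-1
ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, combining class 230
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, combining class 220


def ccc(ch):
    """Return the canonical combining class of a single character."""
    return unicodedata.combining(ch)


def nfd_codepoints(s):
    """Normalize to NFD and list the resulting code points as U+XXXX."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", s)]


# The variation selector has combining class 0, so it is never reordered.
print(ccc(VS1), ccc(ACUTE), ccc(DOT_BELOW))

# VS applied to the base: it stays attached to the base, and the marks
# after it are still put into canonical order (220 before 230).
print(nfd_codepoints("a" + VS1 + ACUTE + DOT_BELOW))

# Hypothetical VS applied to the dot below: canonical ordering swaps the
# two marks, so the VS ends up following the acute accent instead.
print(nfd_codepoints("a" + ACUTE + DOT_BELOW + VS1))
```

    In the last case the selector stays where it is while the mark it was
    meant to modify moves in front of the other accent, which is the
    failure the rule is designed to prevent.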

    So you see, the rule is not based on the linguistic nature of the combining
    character, but on its combining class.

    A./ (Normalization)
