Re: Combining Triple Diacritics (N3915) not accepted by UTC #125

From: Benjamin M Scarborough
Date: Sat Nov 13 2010 - 03:19:55 CST

    I believe that the key to getting these characters encoded is
    establishing that they carry a vital semantic distinction that is
    lost if the mark is stripped away. That is the rationale behind the
    Mathematical Alphanumeric Symbols block.
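
    A minimal sketch of that kind of loss, assuming Python's standard
    unicodedata module. The choice of U+1D400 and of NFKC folding as the
    "stripping" step is an assumption made for illustration only:

```python
import unicodedata

# U+1D400 MATHEMATICAL BOLD CAPITAL A, from the
# Mathematical Alphanumeric Symbols block.
bold_a = "\U0001D400"

# NFKC applies compatibility mappings, which fold the styled letter
# down to a plain Latin "A" (illustrative stand-in for "stripping").
folded = unicodedata.normalize("NFKC", bold_a)

print(unicodedata.name(bold_a))
print(folded == "A")  # the bold/plain distinction does not survive
```

    Once folded, a bold A and a plain A can no longer be told apart,
    which matters wherever the styling itself carries meaning.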

    Unfortunately, figures 1 and 2 from JTC1/SC2/WG2 N3915 actually provide
    a reason -against- encoding. The meaning of the diacritic in these two
    examples is that the transliterated letters were ligated in the
    original text. In this usage, the mark can span any arbitrary number of
    letters; indeed, figure 2 shows the mark in question spanning four
    letters. This makes it a much better candidate for use in higher-level
    markup than a set of combining marks.

    Figures 3 and 4 present a better case and show a stronger need for some
    combining triple diacritic. I notice that all seven examples between
    the two figures represent what would normally be two letters with a
    double diacritic, but some modifier symbol intervenes and stretches the
    tie to span three. However, a proposal for triple diacritics used
    this way would need to show that the sequence of letters with the
    diacritic differs in some important way from the same sequence of
    letters without it, which N3915 fails to establish.

    In any event, I happen to know that there is in some phonetic
    transcription system an "sch" with breve below. It is used to represent
    [ʒ], which contrasts with the unmarked sch used to represent [ʃ]. This
    is a clear semantic distinction, and so the sch with breve below should
    be encoded in some fashion, either as a sequence of characters or as
    a single fully composed one.
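
    A rough sketch of why the distinction is fragile, again assuming
    Python's unicodedata module. Since no triple diacritic is encoded,
    the existing two-letter mark U+035C COMBINING DOUBLE BREVE BELOW is
    used here purely as a hypothetical stand-in; it is not a proposed
    encoding, and strip_marks is an illustrative helper name:

```python
import unicodedata

# Existing double diacritic; it is placed between the two letters it spans.
DOUBLE_BREVE_BELOW = "\u035C"

# Hypothetical approximation only: the tie covers "sc", not all of "sch".
marked = "s" + DOUBLE_BREVE_BELOW + "ch"   # stands in for the [ʒ] spelling
unmarked = "sch"                            # the [ʃ] spelling

def strip_marks(text: str) -> str:
    """Drop every character with a nonzero canonical combining class."""
    return "".join(ch for ch in text if unicodedata.combining(ch) == 0)

print(unicodedata.name(DOUBLE_BREVE_BELOW))
print(marked == unmarked)               # False: the two spellings differ
print(strip_marks(marked) == unmarked)  # True: stripping merges them
```

    After stripping, the two transcriptions collapse into the same
    string, so the contrast between the two sounds is no longer
    recoverable from the text.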

    --Ben Scarborough
