RE: Regarding binary combining accents and grouping characters

From: Philippe Verdy
Date: Sat Jan 12 2008 - 03:38:42 CST

    > -----Original Message-----
    > From: [] On behalf of
    > James Kass
    > Sent: Wednesday, January 9, 2008 19:10
    > To: Thomas Abraham;
    > Subject: Re: Regarding binary combining accents and grouping characters
    > Thomas Abraham wrote,
    > > 1. Is it possible to implement
    > > * a generic subscript/superscript combining accent,
    > > * and/or a generic division combining accent
    > > in Unicode?
    > >
    > > 2. Does the above problem come under the scope of Unicode?
    > >
    > > So far almost all the combining accents we have come across are unary.
    > > They combine with exactly one character on their left or right side.
    > > ...
    > Can you send examples showing what you seek to encode/display?

    I think he wants to reuse any existing base character and transform it into
    an exponent or index (superscript or subscript), or into a character that
    combines at various positions around another character. Such an operation
    is a layout or style feature; encoding it at the character level would
    compromise the unification of existing diacritics.
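
    A minimal sketch, assuming Python's standard unicodedata module, of how
    the already encoded superscript and subscript characters are treated as
    styled variants of ordinary base characters: their compatibility
    decompositions carry a <super> or <sub> tag, and NFKC normalization drops
    the distinction. The helper name style_variant is hypothetical.

```python
import unicodedata

def style_variant(ch):
    """Return (tag, base) when ch has a <super> or <sub> compatibility
    decomposition, otherwise None."""
    decomp = unicodedata.decomposition(ch)
    if not decomp.startswith("<"):
        return None  # no decomposition, or a canonical one
    tag, *codes = decomp.split()
    if tag not in ("<super>", "<sub>"):
        return None  # some other compatibility tag, e.g. <font>
    base = "".join(chr(int(code, 16)) for code in codes)
    return tag.strip("<>"), base

for ch in ["\u00b2", "\u2081", "\u207f", "2"]:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), style_variant(ch))

# Compatibility normalization removes the superscript styling entirely.
plain = unicodedata.normalize("NFKC", "x\u00b2")
print(plain)
```

    Because the styling is lost under NFKC, plain text alone cannot reliably
    carry the superscript/subscript distinction, which is one reason to
    handle it in markup or another layer above the character encoding.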

    I'm not sure we really need this as a generic feature. In any case, the
    proposed solution would also break the character encoding model (think of
    the line and sentence breaking properties, grapheme cluster boundaries, and
    the many new ambiguities and confusables that would create new security
    issues).
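
    A rough sketch, again assuming Python's unicodedata module, of the
    existing "unary" model: every combining mark is attached to the single
    base character that precedes it. The function naive_clusters is a
    hypothetical simplification that only looks at the general category; it
    is not the grapheme cluster algorithm of UAX #29.

```python
import unicodedata

def naive_clusters(text):
    """Group each combining mark (general category M*) with the
    character that precedes it. Simplified illustration only."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch   # the mark joins the preceding base
        else:
            clusters.append(ch)  # a new base starts a new cluster
    return clusters

# e + COMBINING ACUTE ACCENT, followed by a plain "a"
simple = naive_clusters("e\u0301a")

# t + COMBINING DOUBLE INVERTED BREVE + s: the mark is drawn over both
# letters, yet in the encoding it still follows (and belongs to) "t" only.
double = naive_clusters("t\u0361s")

print([len(c) for c in simple], [len(c) for c in double])
```

    A binary combining character that takes two arbitrary operands would not
    fit this attach-to-the-preceding-base rule, so segmentation and breaking
    logic of this kind would have to change.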

    We already faced this problem with interlinear annotations, which I
    consider a break in the text encoding model and a wrong solution applied
    at the wrong level.

    But it's true that it would really help to have new guidelines for
    implementing upper-level protocols, above Unicode, to ensure their
    interoperability. This will be needed anyway for some scripts whose
    encoding alone will not be complete enough to render legible text
    correctly.

    New guidelines will be needed, with different solutions for each script
    (and possibly some joint workshops with other standards bodies already
    working on upper-level protocols that depend on Unicode/ISO/IEC 10646 for
    basic plain-text encoding). Once some experience with complex text layouts
    has accumulated, some of these guidelines could eventually evolve into
    standard annexes, like the BiDi algorithm.

    This archive was generated by hypermail 2.1.5 : Sat Jan 12 2008 - 10:05:21 CST