RE: Unicode 5.1, Egyptian Transliteration, and Fonts

From: Philippe Verdy
Date: Sun Dec 02 2007 - 17:29:11 CST

    Benjamin M Scarborough wrote:
    > Christopher Fynn wrote:
    > >Personally I feel combining marks which cross script boundaries are not
    > >a very good idea.
    > Unfortunately, it's too late to avoid this. I doubt that using U+0486
    > with Latin letters will complicate things that much further.

    It's true that mixing scripts for BASE letters would still be a bad idea.
    But combining marks have long been shared among Latin, Cyrillic,
    Glagolitic, Greek, and Coptic, as well as the IPA alphabet. (I think they
    may even be used with the Georgian or Armenian scripts, if there are
    applications for them.)
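
    As a rough illustration of that sharing, here is a minimal Python sketch
    using only the standard unicodedata module. The three base letters are
    arbitrary examples chosen for this sketch, and the results depend on the
    Unicode data shipped with the interpreter. It shows one generic mark
    (U+0301) composing with a Latin, a Greek, and a Cyrillic base, and it
    shows that normalization does nothing to stop U+0486 from following a
    Latin letter:

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT, a generic mark from the U+0300 block

# One arbitrary base letter each from Latin, Greek, and Cyrillic.
bases = {"Latin": "e", "Greek": "\u03b1", "Cyrillic": "\u0433"}

# NFC composes each base + mark pair into a single precomposed letter.
composed = {
    script: unicodedata.normalize("NFC", base + ACUTE)
    for script, base in bases.items()
}

for script, text in composed.items():
    print(script, [f"U+{ord(ch):04X}" for ch in text])

# U+0486 is named as a Cyrillic mark, but no precomposed form exists with a
# Latin base, so the sequence simply stays as two code points.
PSILI = "\u0486"  # COMBINING CYRILLIC PSILI PNEUMATA
latin_with_psili = unicodedata.normalize("NFC", "a" + PSILI)
print(unicodedata.name(PSILI), len(latin_with_psili))
```

    Whether a font renders the last sequence acceptably is a separate
    matter that this sketch does not address.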

    Conversely, the combining marks of non-alphabetic scripts (notably
    right-to-left abjads and syllabaries) should remain in their own script,
    because of their unique and complex behaviour in those scripts.

    The question remains open for the ideographic script and the simple
    Japanese syllabaries. They seem to use the combining marks for symbols
    (arrows, enclosing squares and circles...) quite freely, but apparently
    make little use of the general combining marks made for alphabetic
    scripts, because those are too easily confused with CJK strokes. For
    example, Hiragana and Katakana share their voicing marks but do not reuse
    the similar-looking accents defined for alphabetic scripts, and they place
    them differently.
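
    The kana case can be checked the same way. This is only a sketch, again
    using the standard unicodedata module, with KA chosen as an arbitrary
    example letter. It shows the single voiced sound mark (U+3099) composing
    with both a Hiragana and a Katakana base, and it shows that this mark has
    a different canonical combining class from a generic accent such as
    U+0301:

```python
import unicodedata

VOICED = "\u3099"  # COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK

# The same mark composes with a Hiragana base and with a Katakana base.
hiragana_ga = unicodedata.normalize("NFC", "\u304b" + VOICED)  # HIRAGANA KA + mark
katakana_ga = unicodedata.normalize("NFC", "\u30ab" + VOICED)  # KATAKANA KA + mark

# The kana mark has its own combining class, distinct from the "above"
# class assigned to generic accents such as U+0301.
kana_class = unicodedata.combining(VOICED)
acute_class = unicodedata.combining("\u0301")

print(unicodedata.name(hiragana_ga), unicodedata.name(katakana_ga))
print(kana_class, acute_class)
```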

    For delimited syllabaries like UCAS or Cherokee (which share a lot of
    graphical characteristics with the alphabetic scripts from which they were
    largely built, without keeping their semantics), these diacritics could be
    used quite safely as well. I am not sure whether this should apply to
    Ethiopic...

    The generic combining marks (U+0300 and onward) should really have been
    documented with a restriction allowing their use only in simple scripts,
    and advising prospective users of other scripts to at least send an
    inquiry to the UTC or ISO WG2, possibly proposing a separate encoding,
    before using them more broadly. There will still be some examples showing
    the opposite, but this should not be a long-term solution...

    This archive was generated by hypermail 2.1.5 : Sun Dec 02 2007 - 17:49:34 CST