Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)

From: Andrew West (andrewcwest@gmail.com)
Date: Sat Nov 03 2007 - 12:28:07 CST

    On 02/11/2007, James Kass <thunder-bird@earthlink.net> wrote:
    >
    > > The beauty of the ZWJ model (or evilness of the model, depending on
    > > your point of view) is that an A-ZWJ-B ligature may look exactly the
    > > same as a B-ZWJ-A ligature but would be treated as distinct entities.
    > > Thus, if someone wanted to create a ligature of U+9F8D 龍 long2
    > > "dragon" U+9580 門 men2 "gate" as cute way of writing Longmen 龍門
    > > "Dragon's Gate", with U+9F8D inside U+9580 they could do so with the
    > > sequence <U+9F8D U+200D U+9580> (representing the logical order of
    > > the ligatured characters). This would render the same as Ben's
    > > <U+9580 U+200D U+9F8D>, but would be treated differently by search
    > > engines, etc.
    >
    > Are you sure they would both render the same?

    No, I'm not. The font designer would need to know in advance what
    ligatures he was supporting and what he expected them to look like.
    This isn't intended as a generic composition mechanism in the way
    that IDS sequences can be, but as a mechanism for dealing with
    specific kanji ligatures that may occasionally be needed.
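
To make the "font designer needs to know in advance" point concrete, here is a rough Python sketch of a ligature lookup. Everything in it is hypothetical: the table, the glyph name "dragon_in_gate" and the shape() helper are invented for illustration and do not correspond to any real font or shaping engine. It only shows that the two ZWJ sequences render the same if, and only if, the designer chose to map both to the same glyph.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER

# Hypothetical ligature table for one imaginary font. The designer has
# decided to map both orderings to the same glyph; nothing forces this.
LIGATURES = {
    "\u9580" + ZWJ + "\u9F8D": "dragon_in_gate",  # 門 ZWJ 龍
    "\u9F8D" + ZWJ + "\u9580": "dragon_in_gate",  # 龍 ZWJ 門
}


def shape(text, ligatures=LIGATURES):
    """Return a list of glyph names for text.

    Three-code-point sequences listed in the table become one ligature
    glyph. Any other ZWJ is dropped (it has no visible glyph), and every
    remaining character is shown as its own glyph.
    """
    glyphs = []
    i = 0
    while i < len(text):
        chunk = text[i:i + 3]
        if chunk in ligatures:
            glyphs.append(ligatures[chunk])
            i += 3
        elif text[i] == ZWJ:
            i += 1  # unsupported joiner: skip it, fall back to plain glyphs
        else:
            glyphs.append("U+%04X" % ord(text[i]))
            i += 1
    return glyphs
```

With the table above both orderings come out as the single "dragon_in_gate" glyph; with an empty table (a font that knows nothing about the ligature) each sequence falls back to its two ordinary glyphs in the order they were typed.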

    One advantage of using ZWJ over IDS is that ZWJ can retain a semantic
    distinction between two sequences that render as identical glyphs. For
    example, if Ben Monroe's cousin Bill Romon (another venerable Scottish
    surname) decides that the Japanese form of his surname should also be
    <U+2FF5 U+9580 U+9F8D>, but read roumon rather than monrou, using ZWJ
    retains a distinction between the two names: <U+9580 U+200D U+9F8D>
    "Monroe" and <U+9F8D U+200D U+9580> "Romon". Even better, in the
    likely event that a font does not support the ligature, the fallback
    display will be <U+9580 U+9F8D> 門龍 "mon rou" for Mr. Monroe and
    <U+9F8D U+9580> 龍門 "rou mon" for Mr. Romon.
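
The Monroe/Romon comparison can be sketched at the text level in Python. This is only an illustration under stated assumptions: fallback() and naive_search() are made-up helpers, and the search is a plain code-point substring match. A real search engine might well ignore ZWJ altogether, in which case it would see 門龍 and 龍門, which are still distinct.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER
IDC_SURROUND_FROM_ABOVE = "\u2FF5"  # ideographic description character ⿵

# The two names as ZWJ sequences, in logical (reading) order.
monroe = "\u9580" + ZWJ + "\u9F8D"  # 門 ZWJ 龍, read "monrou"
romon = "\u9F8D" + ZWJ + "\u9580"   # 龍 ZWJ 門, read "roumon"

# The IDS describes only the shape (龍 inside 門), so both cousins
# would end up with this one sequence.
ids_for_either_name = IDC_SURROUND_FROM_ABOVE + "\u9580\u9F8D"


def fallback(text):
    """Approximate what shows when the ligature is unsupported: the
    joiner is invisible, leaving the two ordinary characters."""
    return text.replace(ZWJ, "")


def naive_search(documents, query):
    """Hypothetical search: exact code-point substring match."""
    return [doc for doc in documents if query in doc]


documents = ["Mr. " + monroe, "Mr. " + romon]
print(monroe == romon)                    # False: distinct sequences
print(fallback(monroe), fallback(romon))  # 門龍 龍門
print(len(naive_search(documents, monroe)))  # 1: only Mr. Monroe matches
```

Under these assumptions the two ZWJ spellings never compare equal and each search finds only its own name, whereas the single IDS spelling cannot tell the cousins apart.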

    But this is all just idle chatter that I don't expect anyone to take
    too seriously.

    Andrew


