Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)

From: Andrew West (andrewcwest@gmail.com)
Date: Sat Nov 03 2007 - 12:28:07 CST

Next message: James Kass: "Re: logos, symbols, and ligatures (RE: Encoding Personal Use Ideographs)"

Previous message: Michael Maxwell: "RE: Tamil Sri / Shri"
In reply to: James Kass: "Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)"
Next in thread: vunzndi@vfemail.net: "Non-Han Han characters (was Re: Level of Unicode support required for various languages)"
Reply: vunzndi@vfemail.net: "Non-Han Han characters (was Re: Level of Unicode support required for various languages)"

On 02/11/2007, James Kass <thunder-bird@earthlink.net> wrote:
>
> > The beauty of the ZWJ model (or evilness of the model, depending on
> > your point of view) is that an A-ZWJ-B ligature may look exactly the
> > same as a B-ZWJ-A ligature but would be treated as distinct entities.
> > Thus, if someone wanted to create a ligature of U+9F8D 龍 long2
> > "dragon" and U+9580 門 men2 "gate" as a cute way of writing Longmen 龍門
> > "Dragon's Gate", with U+9F8D inside U+9580 they could do so with the
> > sequence <U+9F8D U+200D U+9580> (representing the logical order of
> > the ligatured characters). This would render the same as Ben's
> > <U+9580 U+200D U+9F8D>, but would be treated differently by search
> > engines, etc.
>
> Are you sure they would both render the same?

No, I'm not. The font designer would need to know in advance what
ligatures he was supporting and what he expected them to look like.
But this isn't intended as a generic composition mechanism in the way
that IDS sequences can be; it is a mechanism for dealing with
specific kanji ligatures that may occasionally be needed.

One advantage of using ZWJ over IDS is that the former mechanism may
retain a semantic distinction between two identical glyphs. For
example, if Ben Monroe's cousin Bill Romon (another venerable Scottish
surname) decides that the Japanese form of his surname should also be
<U+2FF5 U+9580 U+9F8D>, but read roumon rather than monrou, using ZWJ
retains a distinction between the two names, <U+9580 U+200D U+9F8D>
"Monroe" and <U+9F8D U+200D U+9580> "Romon". And even better, in the
likely event that a font does not support the ligature, the fallback
display will be <U+9580 U+9F8D> 門龍 "mon rou" for Mr. Monroe and
<U+9F8D U+9580> 龍門 "rou mon" for Mr. Romon.
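
To make the distinction concrete, here is a rough sketch in Python. The
names (monroe, romon, fallback) are made up for illustration, and the
fallback function only approximates what a renderer does when a font
lacks the ligature, on the assumption that ZWJ is then simply invisible.

```python
ZWJ = "\u200d"    # ZERO WIDTH JOINER
MEN = "\u9580"    # 門 "gate"
LONG = "\u9f8d"   # 龍 "dragon"
IDC_SURROUND_FROM_ABOVE = "\u2ff5"  # ⿵, the IDS operator used above

# Two ZWJ sequences that a font might draw as the same glyph.
monroe = MEN + ZWJ + LONG    # <U+9580 U+200D U+9F8D>
romon = LONG + ZWJ + MEN     # <U+9F8D U+200D U+9580>

# The IDS description is a single sequence shared by both names,
# so it cannot tell them apart.
ids = IDC_SURROUND_FROM_ABOVE + MEN + LONG


def fallback(text: str) -> str:
    """Approximate the display when the ligature is unsupported.

    Assumption: the renderer shows the base characters and the
    ZWJ contributes nothing visible.
    """
    return text.replace(ZWJ, "")


# The two names remain distinct code point sequences.
print(monroe == romon)

# Without ligature support, each name falls back to its logical order.
print(fallback(monroe))
print(fallback(romon))

# A naive substring search for plain 龍門 does not match the ZWJ form,
# which is one way the sequences could be "treated differently".
print((LONG + MEN) in romon)
```

Whether a real search engine would ignore or respect the ZWJ is up to
that engine; the last line only shows what a plain code point comparison
does.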

But this is all just idle chatter that I don't expect anyone to take
too seriously.

Andrew

