Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)

From: Andrew West (andrewcwest@gmail.com)
Date: Thu Nov 01 2007 - 08:41:45 CST

Next message: Murray Sargent: "RE: Stix beta fonts released"

Previous message: Ed Trager: "Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)"
In reply to: vunzndi@vfemail.net: "Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)"
Next in thread: John H. Jenkins: "Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)"
Reply: John H. Jenkins: "Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)"
Reply: James Kass: "Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)"

On 01/11/2007, vunzndi@vfemail.net <vunzndi@vfemail.net> wrote:
>
> Most characters are like your name,

On the other hand, most ideographic characters are not at all like the
character for Ben's name, which is not a single abstract character,
but is really just a ligature
<http://ja.wikipedia.org/wiki/%E5%90%88%E5%AD%97> of its two
components, U+9580 門 "mon" plus U+9F8D 龍 "rou". It has no meaning in
itself other than being a phonetic representation of "Monroe".

Such ligatures are very rare in Chinese (the typical example is U+74E9
瓩 qian1wa3 "kilowatt"), but more common in Japanese -- the name of
Kitagawa Utamaro 喜多川歌麿 comes to mind, where U+9EBF 麿 maro is a
ligature of the characters U+9EBB 麻 and U+5415 吕 (I am probably wrong
on this, but from a quick google it seems that he may have been the
first person to join the two characters into one, and in earlier times
the two components of the character were written separately).

If you were going to ask me what the "best" way to represent kanji
ligatures such as <U+2FF5 U+9580 U+9F8D> would be under an ideal
Unicode model, I would say as <U+9580 U+200D U+9F8D>, using ZWJ to
indicate the ligation, and smart fonts would ligate the two components
into a single glyph if they could. But back in the real world, the
approach taken has been to encode kanji ligatures (and kana ligatures,
and even kana/kanji hybrid ligatures) as separate characters, so Ben's
character is a potential candidate for encoding.
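
To make the two candidate sequences above concrete, here is a minimal
Python sketch that builds both and lists their code points. The
variable and function names are purely illustrative, and whether any
renderer displays either sequence as a single glyph depends entirely
on the font and shaping engine, which this sketch does not test.

```python
# Code points taken from the message above.
MON = "\u9580"                  # U+9580, first component
RYU = "\u9F8D"                  # U+9F8D, second component
IDC_SURROUND_ABOVE = "\u2FF5"   # U+2FF5, ideographic description character
ZWJ = "\u200D"                  # U+200D, ZERO WIDTH JOINER

# Ideographic description sequence: operator followed by its two operands.
ids_sequence = IDC_SURROUND_ABOVE + MON + RYU

# Hypothetical "ideal model" spelling: components joined by ZWJ.
zwj_sequence = MON + ZWJ + RYU


def describe(seq):
    """Return the code points of seq in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in seq]


if __name__ == "__main__":
    print(describe(ids_sequence))
    print(describe(zwj_sequence))
```

Both sequences are three code points long; the difference is only in
what they ask of the rendering side, a description of the shape in one
case and a request to ligate in the other.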

Andrew

