Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)

From: vunzndi@vfemail.net
Date: Fri Nov 02 2007 - 06:22:56 CST

Next message: Andrew West: "Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)"

Previous message: vunzndi@vfemail.net: "Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)"
In reply to: Andrew West: "Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)"
Next in thread: Andrew West: "Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)"
Reply: Andrew West: "Re: Encoding Personal Use Ideographs (was Re: Level of Unicode support required for various languages)"

Quoting Andrew West <andrewcwest@gmail.com>:

> On 01/11/2007, John H. Jenkins <jenkins@apple.com> wrote:
>>
>> > If you were going to ask me what the "best" way to represent kanji
>> > ligatures such as <U+2FF5 U+9580 U+9F8D> would be under an ideal
>> > Unicode model, I would say as <U+9580 U+200D U+9F8D>, using ZWJ to
>> > indicate the ligation, and smart fonts would ligate the two components
>> > into a single glyph if they could.
>>
>> Actually, do it without the ZWJ, which would break the IDS syntax.
>> Just make the ligature on by default.
>
> To clarify, in my ideal world IDS sequences would not be composable
> into a single glyph by fonts, but would always be rendered as a
> sequence of IDC and ideographic characters. I would use ZWJ for
> hanzi/kanji ligation without any IDC characters. The obvious
> disadvantage to this is that it does not give the font any clues as to
> what the character should look like, but that is true for all scripts
> that have ligatures. In the case of simple kanji ligatures the
> resultant glyph is usually self-evident, but in any case font
> designers would probably have to know which particular kanji ligatures
> they wanted to support in the first place.
>
> The beauty of the ZWJ model (or evilness of the model, depending on
> your point of view) is that an A-ZWJ-B ligature may look exactly the
> same as a B-ZWJ-A ligature but would be treated as distinct entities.
> Thus, if someone wanted to create a ligature of U+9F8D 龍 long2
> "dragon" and U+9580 門 men2 "gate" as a cute way of writing Longmen 龍門
> "Dragon's Gate", with U+9F8D inside U+9580 they could do so with the
> sequence <U+9F8D U+200D U+9580> (representing the logical order of
> the ligatured characters). This would render the same as Ben's
> <U+9580 U+200D U+9F8D>, but would be treated differently by search
> engines, etc.
>

Yes, though the question is of course what counts as obvious; cf.

U+9584 閄
U-00021B89 𡮉
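
The point quoted above, that A-ZWJ-B and B-ZWJ-A may look identical yet remain distinct in the text, can be illustrated with a minimal Python sketch. The variable and function names here are my own and purely illustrative; the sketch only shows how the two code point sequences compare as strings, not how any font or search engine actually treats them.

```python
import unicodedata

ZWJ = "\u200D"   # ZERO WIDTH JOINER
MEN = "\u9580"   # U+9580, men2 "gate"
LONG = "\u9F8D"  # U+9F8D, long2 "dragon"

# The two hypothetical ligature sequences discussed in the thread.
gate_dragon = MEN + ZWJ + LONG   # <U+9580 U+200D U+9F8D>
dragon_gate = LONG + ZWJ + MEN   # <U+9F8D U+200D U+9580>


def describe(text):
    """List each code point in text with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


def nfc(text):
    """Return the NFC-normalized form of text."""
    return unicodedata.normalize("NFC", text)


# Normalization does not reorder the ideographs around the ZWJ,
# so the two sequences stay distinct even after NFC.
same_after_nfc = nfc(gate_dragon) == nfc(dragon_gate)

for line in describe(gate_dragon):
    print(line)
print("identical after NFC:", same_after_nfc)
```

So even if a font drew both sequences with the same glyph, a plain string comparison would still tell them apart.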

> Incidentally, if Ben does want to find evidence for <U+2FF5 U+9580
> U+9F8D> that will satisfy UTC and WG2 then my suggestion is that he
> trawls through the corpus of literature relating to the Longmen
> Grottoes <http://en.wikipedia.org/wiki/Longmen_Grottoes> and ancient
> descriptions of walled cities with gates named Longmen -- I'm sure
> that someone sometime somewhere must have already created the
> character as a shorthand for <U+9F8D U+9580>. The thing that really
> surprises me is that it is not already encoded, when we have
> characters such as:
>
>

There are literally thousands, even tens of thousands, of very simple
characters not encoded. The simplest ones I can think of have only
four strokes. The best known is the rectangle with a vertical line;
my favourite consists of U+5B50 子 and U+529B 力, Zhuang lwg,
meaning "child".

Can anyone think of a three-stroke character that is on the list of characters to be encoded?

The point of the above is that if even fairly common four-stroke
characters are yet to be encoded, it should be no surprise that
<U+9F8D U+9580> has not been.

John



This archive was generated by hypermail 2.1.5 : Fri Nov 02 2007 - 06:25:28 CST