Re: [hebrew] Re: Aramaic unification and information retrieval

From: Jungshik Shin (jshin@mailaps.org)
Date: Thu Dec 25 2003 - 04:46:16 EST

Next message: Philippe Verdy: "Re: Aramaic unification and information retrieval"

Previous message: Jungshik Shin: "Re: [hebrew] Re: Aramaic unification and information retrieval"
In reply to: Philippe Verdy: "Re: [hebrew] Re: Aramaic unification and information retrieval"

On Thu, 25 Dec 2003, Philippe Verdy wrote:

> "Michael Everson" <everson@evertype.com>
> > We have encoded 70,000 of them.
>
> All depends on the way you define characters. Most ideographs are composed,
> but Unicode and the CJK unification working groups have failed for now to
> define a coherent definition of how these characters really compose, so we

Is there a well-defined set of components from which all Chinese
characters can be composed unambiguously, and which all interested
parties (or at least the national standards bodies) can agree on?

> are still assisting to an always exploding number of compound ideographs,
> created everyday by Han users.
>
> If Latin characters were counted the way Han is, we would probably reach
> similar (may be even more) composed "characters". It's just infortunate that
> Han lacks a way to describe its composition model

You're trying to bridge the enormous gulf between alphabetic scripts
like Latin, Greek, and Cyrillic on the one hand and an 'isolating' (my
own word) script like Chinese on the other. It's true that Chinese
characters have some 'phonetic' characteristics, but one of the
important things to consider is how Chinese characters have been
perceived by their users.

It might be helpful to 'decompose' / 'disassemble' Chinese characters
into smaller components when you design fonts (Bitstream or some other
foundries tried this and somebody at Stanford also had a web page on
this), but I don't think there's anything fundamentally wrong with the
current 'encoding' model of Chinese characters in Unicode/10646.

> (it used to be the case too for the Hangul Alphabet,

When was that the case? It never was.

> but recent works seem to demonstrate that the
> complexity of Hangul is just superficial in Unicode but forgets the actual

"Recent" works by whom? Philippe, it may be recent and new to you
(I'm sorry to say, but a lot of the things you've been 'discovering' are
common knowledge to Koreans and a lot of others), but it's been that way
since the invention of the Korean script in 1443.
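
To make the point about Hangul concrete: Unicode encodes the modern
syllables as precomposed characters, but each one maps arithmetically
onto its conjoining jamo. The sketch below assumes the constants of the
Unicode Standard's Hangul syllable decomposition algorithm; the function
name decompose_hangul is only illustrative.

```python
import unicodedata

# Constants from the Unicode Hangul syllable decomposition algorithm.
S_BASE = 0xAC00   # first precomposed syllable (U+AC00)
L_BASE = 0x1100   # first leading consonant jamo
V_BASE = 0x1161   # first vowel jamo
T_BASE = 0x11A7   # one before the first trailing consonant jamo
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT      # syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT      # total precomposed syllables


def decompose_hangul(syllable: str) -> str:
    """Return the conjoining jamo sequence for one precomposed syllable."""
    s_index = ord(syllable) - S_BASE
    if not 0 <= s_index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    lead = chr(L_BASE + s_index // N_COUNT)
    vowel = chr(V_BASE + (s_index % N_COUNT) // T_COUNT)
    t_index = s_index % T_COUNT
    # t_index == 0 means the syllable has no trailing consonant.
    trail = chr(T_BASE + t_index) if t_index else ""
    return lead + vowel + trail


if __name__ == "__main__":
    for ch in "한글":
        jamo = decompose_hangul(ch)
        # The arithmetic result should agree with canonical decomposition.
        assert jamo == unicodedata.normalize("NFD", ch)
        print(ch, [f"U+{ord(j):04X}" for j in jamo])
```

The result should match what unicodedata.normalize("NFD", ...) returns
for the same syllable, since canonical decomposition of Hangul uses the
same arithmetic.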

Jungshik


This archive was generated by hypermail 2.1.5 : Thu Dec 25 2003 - 05:41:13 EST