Re: [hebrew] Re: Aramaic unification and information retrieval

From: Jungshik Shin (jshin@mailaps.org)
Date: Thu Dec 25 2003 - 04:46:16 EST


    On Thu, 25 Dec 2003, Philippe Verdy wrote:

    > "Michael Everson" <everson@evertype.com>
    > > We have encoded 70,000 of them.
    >
    > All depends on the way you define characters. Most ideographs are composed,
    > but Unicode and the CJK unification working groups have failed for now to
    > define a coherent definition of how these characters really compose, so we

      Is there a well-defined set of components from which all Chinese
    characters can be unambiguously composed, and which all interested
    people (or at least the national standards bodies) can agree upon?

    > are still assisting to an always exploding number of compound ideographs,
    > created everyday by Han users.
    >
    > If Latin characters were counted the way Han is, we would probably reach
    > similar (may be even more) composed "characters". It's just infortunate that
    > Han lacks a way to describe its composition model

      You're trying to bridge the enormous gulf between alphabetic scripts
    like Latin, Greek, and Cyrillic on the one hand and 'isolating' (it's my
    own word) scripts like Chinese on the other. It's true that Chinese
    characters have some 'phonetic' characteristics, but one of the important
    things to consider is how Chinese characters have been perceived by
    their users.

     It might be helpful to 'decompose' / 'disassemble' Chinese characters
    into smaller components when you design fonts (Bitstream or some other
    foundries tried this and somebody at Stanford also had a web page on
    this), but I don't think there's anything fundamentally wrong with the
    current 'encoding' model of Chinese characters in Unicode/10646.

    > (it used to be the case too for the Hangul Alphabet,

      When was that the case? It never was.

    > but recent works seem to demonstrate that the
    > complexity of Hangul is just superficial in Unicode but forgets the actual

      Whose "recent" works? Philippe, it may be recent and new to you
    (I'm sorry to say, but a lot of things you've been 'discovering' are
    common knowledge to Koreans and a lot of others), but it's been that way
    since the invention of the Korean script in 1443.
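
      To illustrate how regular the composition is: each precomposed
    Hangul syllable in Unicode maps arithmetically onto its constituent
    jamo. Below is a minimal sketch in Python. The function name
    decompose_hangul is made up for this example; the constants are the
    ones used by the Unicode Standard's Hangul syllable decomposition, and
    the result should agree with the NFD normalization form.

```python
import unicodedata

# Constants from the Unicode Standard's Hangul syllable decomposition.
S_BASE = 0xAC00   # first precomposed syllable (GA)
L_BASE = 0x1100   # first leading consonant jamo
V_BASE = 0x1161   # first vowel jamo
T_BASE = 0x11A7   # one before the first trailing consonant jamo
V_COUNT = 21
T_COUNT = 28
N_COUNT = V_COUNT * T_COUNT   # 588 syllables per leading consonant
S_COUNT = 19 * N_COUNT        # 11172 precomposed syllables in total


def decompose_hangul(syllable: str) -> str:
    """Return the conjoining jamo sequence for one precomposed syllable.

    Hypothetical helper written for illustration only.
    """
    s_index = ord(syllable) - S_BASE
    if not 0 <= s_index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    leading = L_BASE + s_index // N_COUNT
    vowel = V_BASE + (s_index % N_COUNT) // T_COUNT
    trailing = T_BASE + s_index % T_COUNT
    jamo = chr(leading) + chr(vowel)
    if trailing != T_BASE:        # index 0 means "no trailing consonant"
        jamo += chr(trailing)
    return jamo


# The arithmetic result matches the standard library's NFD normalization.
for ch in "한글":
    assert decompose_hangul(ch) == unicodedata.normalize("NFD", ch)
```

      No lookup table is needed in either direction, which is what one
    would expect of an alphabet whose letters are merely arranged into
    syllable blocks.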

      Jungshik


