RE: Compression through normalization

From: Kent Karlsson (kentk@cs.chalmers.se)
Date: Thu Dec 04 2003 - 06:40:02 EST

    Philippe Verdy wrote:

    > I just have another question for Korean: many jamos are in fact
    > composed from other jamos: this is clearly visible both in their name
    > and in their composed glyph. What would be the linguistic impact of
    > decomposing them (not canonically!)? Do Koreans really learn these
    > jamos without breaking them into their components? I am thinking here
    > of SSANG (double) consonants, or the initial Y or final E of some vowels...
    > Of course I won't be able to use such decomposition in Unicode,

    Of course you, and anyone else, can, just as one can run spell
    checkers/correctors, convert digits between scripts, do transcriptions,
    or apply any other kind of processing to Unicode text. It cannot be part
    of normalisation, though. I agree that this is unfortunate in this case,
    since each letter-cluster jamo really consists of a sequence of two or
    more letters. Fortunately, the definition of Hangul syllable blocks need
    not be changed, as it works just as well with Hangul syllables defined as
    L+, V+, T* (where L, V, and T stand for single-letter jamos).
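
    As a rough illustration of such application-level processing, here is
    a minimal Python sketch. It is not part of any Unicode normalisation
    form. The helper names (split_ssang, decompose_ssang, SYLLABLE) are
    hypothetical, the SSANG detection relies only on the character names
    reported by Python's unicodedata module, and the L/V/T ranges are
    assumed to be those of the basic Hangul Jamo block (U+1100..U+11FF).
    Other clusters, such as the Y- and -E vowels, are not handled.

```python
import re
import unicodedata

# Assumed ranges: leading consonants, vowels, trailing consonants of the
# basic Hangul Jamo block only (extended jamo blocks are ignored here).
SYLLABLE = re.compile(r"[\u1100-\u115F]+[\u1160-\u11A7]+[\u11A8-\u11FF]*")

JAMO_PREFIXES = {"HANGUL CHOSEONG", "HANGUL JUNGSEONG", "HANGUL JONGSEONG"}


def split_ssang(ch: str) -> str:
    """Return the base jamo twice for a simple SSANG jamo, else ch unchanged."""
    try:
        name = unicodedata.name(ch)
    except (ValueError, TypeError):
        return ch  # unnamed character, or not a single character
    prefix, _, last = name.rpartition(" ")
    # Only plain "SSANGxxx" names; hyphenated clusters are left alone.
    if prefix not in JAMO_PREFIXES or not last.startswith("SSANG") or "-" in last:
        return ch
    try:
        base = unicodedata.lookup(prefix + " " + last[len("SSANG"):])
    except KeyError:
        return ch  # no single-letter counterpart with that name
    return base * 2


def decompose_ssang(text: str) -> str:
    """Canonically decompose (NFD), then split SSANG jamos non-canonically."""
    return "".join(split_ssang(c) for c in unicodedata.normalize("NFD", text))


if __name__ == "__main__":
    # SSANGKIYEOK + A + final SSANGKIYEOK, as conjoining jamos.
    syllable = "\u1101\u1161\u11A9"
    split = decompose_ssang(syllable)
    print([f"U+{ord(c):04X}" for c in split])
    # Both forms still fit the L+ V+ T* pattern.
    print(bool(SYLLABLE.fullmatch(syllable)), bool(SYLLABLE.fullmatch(split)))
```

    The result is still a single L+ V+ T* block, which is why the syllable
    definition itself would not need to change under such a decomposition.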

                    /kent k


