Re: Korean compression (was: Re: Ternary search trees for Unicode dictionaries)

From: Doug Ewell
Date: Tue Nov 25 2003 - 13:46:06 EST

    Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

    > The question of Latin letters with two diacritics added in Latin
    > Extension B does not seem to respect this constraint, as it is not
    > justified in the Vietnamese VISCII standard that already does not
    > contain characters with two diacritics, but already composes them
    > with two characters in the limited CCS set.

    Not true. If you like, I can send you a copy of the VISCII report
    showing not only the mappings, but also their justification. The
    Viet-Std organization went to great lengths to avoid combining
    characters, even, as John said, to the point of encoding six graphic
    characters in the C-zero control area.

    Perhaps you are thinking of Windows code page 1258, which includes many
    precomposed letters, but none in the Latin Extension B block, and does
    require combining marks for vowels with two diacritics.
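
As a rough illustration of the code page 1258 point, here is a minimal sketch that assumes Python's standard-library `cp1258` codec is a faithful stand-in for the Windows code page. The sample letter (U+1EBF, e with circumflex and acute) and the helper name `encodes_in_cp1258` are chosen for this example only and do not come from the thread.

```python
import unicodedata

# U+1EBF LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE:
# a Vietnamese vowel carrying two diacritics (sample chosen for illustration).
precomposed = "\u1ebf"

# The same letter written as precomposed e-circumflex (U+00EA)
# followed by COMBINING ACUTE ACCENT (U+0301).
base_plus_mark = "\u00ea\u0301"


def encodes_in_cp1258(text: str) -> bool:
    """Hypothetical helper: True if Python's cp1258 codec can encode text as given."""
    try:
        text.encode("cp1258")
        return True
    except UnicodeEncodeError:
        return False


# Encode the base-plus-combining-mark spelling and decode it again.
encoded = base_plus_mark.encode("cp1258")
round_tripped = encoded.decode("cp1258")

print("fully precomposed form encodable:", encodes_in_cp1258(precomposed))
print("base + combining mark encodable:", encodes_in_cp1258(base_plus_mark))
print("bytes used:", len(encoded))
print(
    "NFC of round trip equals precomposed form:",
    unicodedata.normalize("NFC", round_tripped) == precomposed,
)
```

If the codec matches the code page, the fully precomposed letter should be rejected while the two-character spelling encodes to two bytes, and normalizing the decoded text to NFC should give back the single precomposed character.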

    -Doug Ewell
     Fullerton, California
