Re: Korean compression (was: Re: Ternary search trees for Unicode dictionaries)

From: John Cowan
Date: Tue Nov 25 2003 - 08:23:32 EST

    Michael Everson scripsit:

    > Ridiculous. This happened centuries ago, and it is not "why" Ethiopic
    > was encoded as a syllabary. It was encoded as a syllabary because it
    > is a syllabary.

    Structurally it's an abugida, like Indic and UCAS.

    > You are, because the floodgates, while once open, have been closed by
    > normalization.

    Indeed, they were opened in Unicode 1.1, as a result of the merger with
    FDIS 10646; since then, only 46 characters with canonical decompositions
    have been added to Unicode (excepting compatibility ideographs, which
    are a special case).

    Specifically, 16 were added in Unicode 2.0, 29 in Unicode 3.0, and
    just one in Unicode 3.2 (the slashed version of a symbol added at the
    same time).
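
    As a rough illustration of how normalization keeps the floodgates
    closed, here is a minimal Python sketch using the standard unicodedata
    module. It assumes the Unicode 3.2 character in question is U+2ADC
    FORKING, which is my identification and is not stated above; the helper
    name has_canonical_decomposition is made up for the example.

```python
import unicodedata

# Assumption: U+2ADC FORKING is the Unicode 3.2 addition referred to above
# (a slashed form of U+2ADD NONFORKING, which was added in the same version).
forking = "\u2adc"

# Canonical decomposition field from the Unicode Character Database,
# given as space-separated hex code points.
decomposition = unicodedata.decomposition(forking)

# NFD splits the character into base symbol plus combining overlay.
nfd = unicodedata.normalize("NFD", forking)

# NFC decomposes first and then recomposes, but characters excluded from
# composition are never rebuilt, so the precomposed form does not survive.
nfc = unicodedata.normalize("NFC", forking)


def has_canonical_decomposition(ch):
    """Hypothetical helper: True if ch has a canonical (untagged) decomposition."""
    d = unicodedata.decomposition(ch)
    # Compatibility decompositions carry a tag such as "<compat>".
    return bool(d) and not d.startswith("<")


print("decomposition:", decomposition)
print("NFD:", [f"U+{ord(c):04X}" for c in nfd])
print("NFC:", [f"U+{ord(c):04X}" for c in nfc])
```

    Since a late precomposed addition like this one normalizes to the
    decomposed sequence even under NFC, adding it gains little, which is
    why so few have been added.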

    "What has four pairs of pants, lives            John Cowan
    in Philadelphia, and it never rains   
    but it pours?"                        
            --Rufus T. Firefly            
