no more precomposed characters for 1:1 conversion

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Mon Dec 01 2003 - 17:36:25 EST

    I would like to point out one of the new features of ICU 2.8, which is currently available as an
    alpha release: http://oss.software.ibm.com/icu/download/2.8/

    ICU 2.8 has the ability to handle m:n character conversion mappings driven by simple lines in
    Unicode conversion tables (text files).

    I sincerely hope that the availability of this feature will help argue against further assignments
    of precomposed Unicode characters.

    For example, the ibm-1390_P110-2003.ucm conversion table file (for EBCDIC Japanese with the JIS X
    0213 repertoire) contains lines like

    <U304B><U309A> \xEC\xB5 |0

    which expresses the mapping between a sequence of two Unicode code points (U+304B Hiragana Ka
    followed by U+309A, the combining semi-voiced sound mark) and one DBCS sequence.

    Either side of the mapping can contain multiple "characters" - Unicode code points on one side,
    complete codepage byte sequences on the other.
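
    To illustrate the line format, here is a minimal Python sketch that splits such a mapping line
    into its parts. It is not ICU code: the name parse_ucm_mapping and the regular expression are
    my own assumptions, based only on the shape of the example line above. It does not cover the
    rest of the .ucm syntax (header, comments, state tables), and it returns the trailing "|n"
    digit without interpreting it.

```python
import re

# Hypothetical helper, not part of ICU. The pattern is inferred from the
# single example line in this message, not from the full .ucm grammar.
_MAPPING_LINE = re.compile(
    r'^\s*((?:<U[0-9A-Fa-f]{4,6}>)+)'   # one or more <Uhhhh> code points
    r'\s+((?:\\x[0-9A-Fa-f]{2})+)'      # one or more \xhh codepage bytes
    r'\s*(?:\|(\d))?'                   # optional trailing |n indicator
)


def parse_ucm_mapping(line):
    """Return (code_points, codepage_bytes, indicator) for one mapping line.

    code_points is a list of ints, codepage_bytes is a bytes object, and
    indicator is the digit after '|' as an int, or None if it is absent.
    """
    match = _MAPPING_LINE.match(line)
    if match is None:
        raise ValueError("not a mapping line: %r" % line)
    code_points = [
        int(h, 16) for h in re.findall(r'<U([0-9A-Fa-f]{4,6})>', match.group(1))
    ]
    codepage_bytes = bytes(
        int(h, 16) for h in re.findall(r'\\x([0-9A-Fa-f]{2})', match.group(2))
    )
    indicator = int(match.group(3)) if match.group(3) is not None else None
    return code_points, codepage_bytes, indicator


# The example line from the conversion table: two code points, two bytes.
code_points, codepage_bytes, indicator = parse_ucm_mapping(
    r'<U304B><U309A> \xEC\xB5 |0'
)
```

    Because both sides are captured as sequences, the same sketch also accepts lines with one code
    point and several bytes, or several code points on either side.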

    Best regards,
    markus


