Re: *Complete* Big5 to Unicode mappings

From: Markus Scherer
Date: Mon Apr 21 2003 - 11:52:17 EDT

    John Cowan wrote:
    > Elliotte Rusty Harold scripsit:
    >>Is there anywhere I can find or piece together a *complete* list of
    >>Unicode characters that are available in Big5 (and other similar
    >>sets)? ...

    If you want static data on this, then the various sites with mapping tables are a good source.

    If you want this at runtime, ICU 2.6 will have an API ucnv_getUnicodeSet() that gives you a set of
    all Unicode code points that the specified ICU converter roundtrips. (It's fully implemented in the
    current CVS snapshot.)

    Of course, this is based on the particular mapping table that is loaded for the converter (mostly
    IBM tables in the default build), but you can build your own tables into ICU if you like.

    For an online example, see the ICU Converter Explorer:
    select a converter name, then on the new page click "View Complete Set..."

    > I look forward to the excellent small and fast code you will design
    > for representing the list of valid characters in Big5 and other large
    > character sets....

    ICU has its UnicodeSet C++/Java class for all kinds of sets of code points...
    (The above C API fills a C USet, which is a thin C wrapper around the C++ UnicodeSet.)
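
    To illustrate why such a set can be both small and fast, here is a minimal sketch of an inversion
    list, the kind of range-boundary representation that, as far as I know, UnicodeSet is built on.
    This is not ICU code: the type CodePointSet and the functions cps_contains() and cps_size() are
    invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inversion list: a sorted array of boundaries b[0] < b[1] < ...
 * Code points in [b[0], b[1]) are in the set, [b[1], b[2]) are out,
 * [b[2], b[3]) are in again, and so on. */
typedef struct {
    const uint32_t *bounds;
    size_t count;  /* number of boundaries; even in this sketch */
} CodePointSet;

/* Returns 1 if c is in the set, 0 otherwise. Binary search counts the
 * boundaries <= c; an odd count means c lies inside an included range. */
int cps_contains(const CodePointSet *s, uint32_t c) {
    size_t lo = 0, hi = s->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (s->bounds[mid] <= c) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return (int)(lo & 1);
}

/* Returns the number of code points in the set. */
size_t cps_size(const CodePointSet *s) {
    assert(s->count % 2 == 0);  /* every range needs a start and an end */
    size_t n = 0;
    for (size_t i = 0; i + 1 < s->count; i += 2) {
        n += s->bounds[i + 1] - s->bounds[i];
    }
    return n;
}
```

    With the boundaries {0x0000, 0x0080, 0x4E00, 0x9FA6} (illustrative ranges only, not the actual
    Big5 repertoire), four integers describe more than twenty thousand code points, and a lookup costs
    a handful of comparisons.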


    Opinions expressed here may not reflect my company's positions unless otherwise noted.
