Re: Converting EBCDIC to Unicode

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Tue Feb 11 2003 - 11:46:32 EST

    Doug Ewell wrote:
    > SRIDHARAN Aravind <ASridharan at covansys dot com> wrote:
    >>How to convert EBCDIC data into Unicode?
    >
    > There are informative mapping tables available at:
    >
    > http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/

    There are also various places where IBM publishes EBCDIC<->Unicode conversion tables.
    In ICU's .ucm format, or in UTR #22 XML format, you can find them at
    http://oss.software.ibm.com/icu/charset/

    You can use ICU to perform the conversion: http://oss.software.ibm.com/icu/userguide/conversion.html

    > You need to know which EBCDIC variant (code page) you are converting
    > from. There are dozens.

    Yes - in IBM parlance, you will need to identify which CCSID is used. For some CCSIDs, there are
    multiple Unicode conversion tables, but this is less common with EBCDIC CCSIDs. ICU has an API
    function ucnv_openCCSID(): http://oss.software.ibm.com/icu/apiref/ucnv_8h.html#a55
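
As a rough illustration of why the code page matters, here is a minimal sketch that uses Python's built-in codecs instead of ICU (an assumption made only to keep the example self-contained; it does not show the ICU API). The codec names "cp037" and "cp500" are Python's names for IBM code pages 37 and 500, and the helper name ebcdic_to_unicode is hypothetical.

```python
def ebcdic_to_unicode(raw: bytes, codec: str = "cp037") -> str:
    """Decode EBCDIC bytes to a Unicode string using the named code page.

    The default of cp037 is only an illustrative assumption; real data
    must be decoded with the code page it was actually written in.
    """
    return raw.decode(codec)


# The EBCDIC bytes for "Hello"; the basic Latin letters share these
# code points in cp037 and cp500.
data = bytes.fromhex("c885939396")
text = ebcdic_to_unicode(data)

# Byte 0x4A is one position where the two code pages disagree:
# cp037 maps it to the cent sign, cp500 to a left square bracket.
variant_byte = b"\x4a"
as_037 = ebcdic_to_unicode(variant_byte, "cp037")
as_500 = ebcdic_to_unicode(variant_byte, "cp500")

print(text, repr(as_037), repr(as_500))
```

Picking the wrong code page does not raise an error here; it silently yields a different character, which is why the CCSID has to be known up front.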

    > They are all the same in the A-Z, a-z, and 0-9
    > ranges, but beyond that they can differ substantially.

    A few more characters share the same codes in most EBCDIC code pages, but there are also some
    code pages in which not all of the Latin letters are present. (I think some old Japanese EBCDIC
    code pages replace the small Latin letters with Katakana characters.)
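
One way to check which byte values are stable is to decode every byte under several code pages and keep those that always yield the same character. The sketch below does this with four EBCDIC codecs that ship with Python (cp037, cp273, cp500, cp1026); that selection is an assumption for illustration and does not include the Japanese code pages mentioned above. The function name invariant_bytes is hypothetical.

```python
import string

# Illustrative selection of EBCDIC codecs from Python's standard library.
CODECS = ["cp037", "cp273", "cp500", "cp1026"]


def invariant_bytes(codecs):
    """Return {byte value: character} for bytes that decode identically
    under every codec in the list."""
    same = {}
    for value in range(256):
        # Undefined positions, if any, decode to U+FFFD instead of raising.
        chars = {bytes([value]).decode(c, errors="replace") for c in codecs}
        if len(chars) == 1:
            same[value] = chars.pop()
    return same


common = invariant_bytes(CODECS)
common_chars = set(common.values())

# Letters and digits that are NOT stable across the chosen code pages.
alnum = set(string.ascii_letters + string.digits)
missing = alnum - common_chars

print(len(common), "of 256 byte values agree across", ", ".join(CODECS))
print("letters/digits that differ:", sorted(missing))
```

Adding a code page that lacks some Latin letters to CODECS would be expected to shrink the common set, so the result says something only about the code pages actually listed.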

    markus

    > If you don't find the mapping table you are looking for, I can probably
    > dig it up or reconstruct it.
    >
    > -Doug Ewell
    > Fullerton, California

    -- 
    Opinions expressed here may not reflect my company's positions unless otherwise noted.
    

