Re: Converting EBCDIC to Unicode

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Tue Feb 11 2003 - 11:46:32 EST

Next message: Carl W. Brown: "RE: Converting EBCDIC to Unicode"

Previous message: Doug Ewell: "Re: Glyph of Pipeline Characters ?"
In reply to: Doug Ewell: "Re: Converting EBCDIC to Unicode"
Next in thread: Doug Ewell: "Re: Converting EBCDIC to Unicode"
Reply: Doug Ewell: "Re: Converting EBCDIC to Unicode"
Reply: Carl W. Brown: "RE: Converting EBCDIC to Unicode"

Doug Ewell wrote:
> SRIDHARAN Aravind <ASridharan at covansys dot com> wrote:
>>How to convert EBCDIC data into Unicode?
>
> There are informative mapping tables available at:
>
> http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/

IBM also publishes EBCDIC<->Unicode conversion tables in various places.
You can find them in ICU's .ucm format, or in UTR #22 XML format, at
http://oss.software.ibm.com/icu/charset/

You can use ICU to perform the conversion: http://oss.software.ibm.com/icu/userguide/conversion.html
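
For a quick look at what such a conversion involves, here is a minimal sketch in Python rather than ICU. It assumes the data is in code page 037 (US/Canada EBCDIC) and uses Python's built-in "cp037" codec; the helper name ebcdic_to_text is only illustrative, and real data may well need a different code page.

```python
def ebcdic_to_text(data: bytes, codec: str = "cp037") -> str:
    """Decode EBCDIC bytes to a Unicode string.

    The default codec (cp037) is an assumption; pass the codec that
    matches the actual source code page.
    """
    return data.decode(codec)


# The five bytes below are "Hello" in EBCDIC code page 037.
sample = bytes([0xC8, 0x85, 0x93, 0x93, 0x96])
text = ebcdic_to_text(sample)
print(text)  # prints "Hello"

# Once decoded, the string can be written out in any Unicode encoding.
utf8 = text.encode("utf-8")
```

The same two steps (decode with the source code page, encode to the target Unicode form) apply whichever library performs the table lookup.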

> You need to know which EBCDIC variant (code page) you are converting
> from. There are dozens.

Yes - in IBM parlance, you will need to identify which CCSID is used. For some CCSIDs, there are
multiple Unicode conversion tables, but this is less common with EBCDIC CCSIDs. ICU has an API
function ucnv_openCCSID(): http://oss.software.ibm.com/icu/apiref/ucnv_8h.html#a55
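
To illustrate why the CCSID matters, the sketch below decodes the same byte under two CCSIDs. It is Python, not the ICU API; the CCSID_TO_CODEC table and the decode_by_ccsid helper are hypothetical names made up for this example, and the table covers only a few CCSIDs for which Python ships a codec.

```python
# Illustrative, incomplete mapping from IBM CCSID numbers to the names of
# Python's built-in EBCDIC codecs.
CCSID_TO_CODEC = {
    37: "cp037",     # US/Canada
    273: "cp273",    # Germany/Austria
    500: "cp500",    # International Latin-1
    1026: "cp1026",  # Turkish
    1140: "cp1140",  # like 37, with the euro sign
}


def decode_by_ccsid(data: bytes, ccsid: int) -> str:
    """Decode EBCDIC bytes using the codec registered for the given CCSID."""
    try:
        codec = CCSID_TO_CODEC[ccsid]
    except KeyError:
        raise LookupError(f"no codec known for CCSID {ccsid}") from None
    return data.decode(codec)


raw = bytes([0x5A])
print(decode_by_ccsid(raw, 37))   # prints "!"
print(decode_by_ccsid(raw, 500))  # prints "]"
```

Picking the wrong CCSID does not raise an error here; it silently yields different punctuation, which is why the source code page has to be identified up front.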

> They are all the same in the A-Z, a-z, and 0-9
> ranges, but beyond that they can differ substantially.

There are some more characters that have the same codes in most EBCDIC codepages, but there are also
some codepages in which not all of the Latin letters are present. (I think some old Japanese EBCDIC
codepages replace small Latin letters with Katakana ones.)
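
One rough way to see which codes are shared is to compare a few tables byte by byte. The sketch below does this in Python for a handful of Latin-script EBCDIC codecs that Python happens to ship; the choice of codecs and the name invariant_bytes are assumptions for illustration. It does not cover the Japanese codepages mentioned above, so it cannot show the Katakana case.

```python
import string

# A small sample of Latin-script EBCDIC code pages available in Python.
CODECS = ["cp037", "cp273", "cp500", "cp1026", "cp1140"]


def invariant_bytes(codecs):
    """Return {byte_value: char} for bytes that decode identically in every codec."""
    common = {}
    for value in range(256):
        decoded = {bytes([value]).decode(c, errors="replace") for c in codecs}
        if len(decoded) == 1:
            common[value] = decoded.pop()
    return common


common = invariant_bytes(CODECS)
alnum = set(string.ascii_letters + string.digits)

print(len(common), "of 256 byte values agree across", ", ".join(CODECS))
print("A-Z, a-z and 0-9 all invariant:", alnum <= set(common.values()))
```

For these five tables the letters and digits come out invariant, but that only holds for the sample chosen; adding other codepages can shrink the shared set.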

markus

> If you don't find the mapping table you are looking for, I can probably
> dig it up or reconstruct it.
>
> -Doug Ewell
> Fullerton, California

-- 
Opinions expressed here may not reflect my company's positions unless otherwise noted.

This archive was generated by hypermail 2.1.5 : Tue Feb 11 2003 - 12:34:54 EST