Re: EBCDIC Encoding question

From: AddisonP (AddisonP@simultrans.com)
Date: Wed Nov 04 1998 - 10:34:33 EST


IBM Host DBCS (e.g. EBCDIC conversion) is best documented as triangulated through Unicode, since the guy who maintains this at IBM has the table mapped to 10646-1, although it is a trivial matter to build a direct mapping from the tables. I have the tables around 'here' somewhere. Since I'm in Ireland this week, though, I probably won't be able to publish them to our ftp or web site before I get back. Perhaps my colleague Bill Hall can get to it for me, whereupon we'll announce it here.
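
The triangulation described above can be sketched for the single-byte case. This is only an illustration under assumptions: it uses Python's bundled cp037 and cp1252 codecs as stand-ins (the host DBCS tables are not among Python's standard codecs, as far as I know), and build_direct_table is a hypothetical helper name, not anything from IBM's tables.

```python
SRC = "cp037"    # assumed source: EBCDIC US/Canada CECP
DST = "cp1252"   # assumed target: Windows Latin-1 code page


def build_direct_table(src, dst, substitute=b"?"):
    """Compose src -> Unicode -> dst into one 256-entry byte table.

    Bytes with no counterpart in the target code page get `substitute`.
    """
    table = bytearray(256)
    for byte in range(256):
        try:
            ch = bytes([byte]).decode(src)   # source byte -> Unicode
            out = ch.encode(dst)             # Unicode -> target byte
        except UnicodeError:
            out = substitute                 # no mapping on one side
        table[byte] = out[0]
    return bytes(table)


TABLE = build_direct_table(SRC, DST)

# Converting through the direct table matches going through Unicode.
host_bytes = "Hello, Wörld".encode(SRC)
print(host_bytes.translate(TABLE) == "Hello, Wörld".encode(DST))
```

Some source bytes (for example the C1 control characters) have no slot in the target code page, so a direct table always needs a substitution policy; the "?" used here is just one possible choice.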

If someone else has this stuff lying around pre-cooked for Steven, please don't hesitate on my account.

Addison

---------------------------------
Addison P. Phillips
Director, Technology
SimulTrans, LLC

+1 (650) 526-4652 (office)
AddisonP@simultrans.com

"22 languages. One release date."
---------------------------------
-----Original Message-----
From: stephen_holmes@lionbridge.com <stephen_holmes@lionbridge.com>
To: Unicode List <unicode@unicode.org>
Cc: unicode@unicode.org <unicode@unicode.org>
Date: Monday, 2 November 1998 17:12
Subject: RE: EBCDIC Encoding question

>
>Regarding DBCS EBCDIC tables, are there any conversion tables to/from the
>Windows CP's or do I have to triangulate the conversion through Unicode?
>
>Thanks
>Steve.
>
>
>
>-----Original Message-----
>From: <unicode@unicode.org >
>Sent: 02 November 1998 06:43
>To: Unicode List <unicode@unicode.org>
>Cc: unicode@unicode.org
>Subject: Re: EBCDIC Encoding question
>
>
>Hello,
>
>On 1998-10-26 at 13:19, Julia Oesterle (Unicode) wrote:
>> Can any EBCDIC people answer this fellow's question?
>
>Though I am not one of those "EBCDIC people", I can (as the local guru on
>character encodings, and former EBCDIC user).
>
>On 1998-10-22 at 12:31, Daniel Oppenheimer wrote:
>> I am especially interested in converting between ASCII and EBCDIC.
>
>Note that ASCII uses 7 bits per character, whilst EBCDIC uses 8 bits.
>Hence, the mapping cannot be bijective.
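
The counting argument can be checked with a short sketch. It assumes Python's bundled cp037 codec as one representative EBCDIC code page; the variable names are illustrative only.

```python
CODEC = "cp037"  # assumed representative EBCDIC code page

# Every one of the 128 ASCII characters has its own EBCDIC byte ...
ascii_to_ebcdic = {chr(i): chr(i).encode(CODEC) for i in range(128)}
distinct_targets = set(ascii_to_ebcdic.values())

# ... but the 8-bit code has 256 byte values, so the remaining ones
# decode to characters outside the 7-bit ASCII range.
non_ascii_bytes = [
    b for b in range(256)
    if ord(bytes([b]).decode(CODEC)) > 0x7F
]

print(len(distinct_targets), "EBCDIC bytes are reachable from ASCII")
print(len(non_ascii_bytes), "EBCDIC bytes have no ASCII counterpart")
```

So ASCII to EBCDIC is one-to-one but not onto, and a converter has to decide what to do with the EBCDIC bytes that fall outside ASCII.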
>
>Note also, that ASCII is a particular 7-bit code, viz. ISO 646 IRV,
>whilst many vendors, and text-book authors, abuse the term "ASCII"
>(or the similar term "ANSI") for a plethora of different encodings:
>- MS-DOS abuses the term "ASCII" as a synonym for "text, in whatever
> 8-bit code currently is selected via the 'mode' command", usually
> one of the IBM proprietary codes, CP 437 and CP 850;
>- MS-Windows abuses the term "ANSI" for its proprietary 8-bit code,
> CP 1252 (and perhaps also for other MS proprietary codes, depending
> on the current language setting),
>- many internet encoding utilities abuse the term "ASCII" for the
> 8-bit code "Latin-1" (ISO 8859-1), or its predecessor, the DEC multi-
> lingual terminal code.
>
>> However, there appears to be more than one kind of EBCDIC.
>
>Actually, there are 11 (or so) different EBCDICs for the Latin-1 character
>set, currently supported (the so-called CECPs = "Country Extended Code Pages",
>if I am not mistaken), several other EBCDIC variants for other character
>sets, and several hundred legacy EBCDIC variants.
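
To see how two of the Latin-1 EBCDIC variants differ in practice, one can compare them byte by byte. This is a rough sketch that assumes Python's bundled cp037 and cp500 codecs follow the published mapping tables; differing_positions is a hypothetical helper.

```python
def differing_positions(codec_a, codec_b):
    """Return {byte: (char_a, char_b)} for bytes the two codecs decode differently."""
    diffs = {}
    for byte in range(256):
        raw = bytes([byte])
        char_a = raw.decode(codec_a, errors="replace")
        char_b = raw.decode(codec_b, errors="replace")
        if char_a != char_b:
            diffs[byte] = (char_a, char_b)
    return diffs


DIFFS = differing_positions("cp037", "cp500")
for byte, (c037, c500) in sorted(DIFFS.items()):
    print(f"0x{byte:02X}: cp037={c037!r} cp500={c500!r}")
```

Letters and digits sit at the same positions in both code pages; the differences are confined to a handful of punctuation characters such as the square brackets, which is exactly what makes mixing up the variants hard to notice.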
>
>> I am working on an encoding converter.
>
>Before embarking on any serious work concerning EBCDIC, you should obtain
>your copy of the latest "CDRA Level 1 Reference" (SC09-1390) and "CDRA
>Level 1 Registry" (SC09-1391) from your nearest IBM representative.
>
>> Could someone tell me the difference between EBCDIC 500 and open EBCDIC?
>
>What do you mean by "open EBCDIC"?
>
>You may find the following tables useful:
> <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP037.TXT>
> English (US) CECP, also used in Canada, Netherlands, Portugal, Brazil,
> Australia, and New Zealand
> <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP500.TXT>
> Belgium, Switzerland, and International CECP
> (this was meant to become "the" international CECP, but this attempt
> has failed; meanwhile, CECP 1047 is the agreed standard)
> <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP875.TXT>
> Greek EBCDIC
> <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/EBCDIC/CP1026.TXT>
> Turkish EBCDIC (Latin-5 set)
> <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT>
> Windows code for Latin-1 countries (the "ANSI" misnomer)
> <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/CP437.TXT>
> The "classic" IBM PC code -- but see below
> <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/CP850.TXT>
> The "international" IBM PC codepage, containing (but not limited to)
> the Latin-1 character set -- but see below
> All mappings in <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/>
> are subject to the correction outlined in
> <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/IBM/README.TXT>.
>All of these tables map various code pages to Unicode, resulting in a common
>descriptive framework for those distinct code pages.
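
Tables of this kind are easy to load programmatically. The sketch below assumes the usual layout of these mapping files (a hexadecimal code-page byte, a hexadecimal Unicode value, then a "#" comment, separated by whitespace); SAMPLE is a made-up excerpt in that layout, not a copy of any of the files, and parse_mapping is a hypothetical helper.

```python
def parse_mapping(text):
    """Parse '0xBYTE 0xUNICODE #comment' lines into {byte: character}."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue                           # byte with no Unicode mapping
        table[int(fields[0], 16)] = chr(int(fields[1], 16))
    return table


# Illustrative sample only; the real files cover all defined bytes.
SAMPLE = """\
# hypothetical excerpt in the assumed layout
0xC1\t0x0041\t#LATIN CAPITAL LETTER A
0xC2\t0x0042\t#LATIN CAPITAL LETTER B
0xF0\t0x0030\t#DIGIT ZERO
"""

MAPPING = parse_mapping(SAMPLE)
print(MAPPING)
```

With every code page loaded into a byte-to-Unicode dictionary like this, conversion between any two of them reduces to a lookup in one table followed by a reverse lookup in the other.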
>
>Best wishes,
> Otto Stolz
>
>


