On 1998-10-26 at 13:19, Julia Oesterle (Unicode) wrote:
> Can any EBCDIC people answer this fellow's question?
Though I am not one of those "EBCDIC people", I can (as the local guru on
character encodings, and former EBCDIC user).
On 1998-10-22 at 12:31, Daniel Oppenheimer wrote:
> I am especially interested in converting between ASCII and EBCDIC.
Note that ASCII uses 7 bits per character (128 code points), whilst EBCDIC
uses 8 bits (256 code points). Hence, the mapping cannot be bijective.
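To illustrate the point, here is a minimal Python sketch. It assumes that
Python's standard codec "cp037" is an acceptable stand-in for the US English
EBCDIC CECP; the variable names are made up for this example.

```python
# Sketch: ASCII has 128 code points, an 8-bit EBCDIC code page has 256.
# Assumption: Python's "cp037" codec stands in for the US English EBCDIC CECP.
ascii_text = bytes(range(128)).decode("ascii")

# Every ASCII character survives the trip to EBCDIC and back.
ebcdic_bytes = ascii_text.encode("cp037")
round_trip_ok = ebcdic_bytes.decode("cp037") == ascii_text

# The reverse direction is lossy: collect the EBCDIC byte values whose
# character lies outside the 7-bit ASCII range.
not_in_ascii = [b for b in range(256)
                if ord(bytes([b]).decode("cp037")) > 127]

print("ASCII -> EBCDIC -> ASCII round trip intact:", round_trip_ok)
print("distinct EBCDIC bytes used by ASCII text:", len(set(ebcdic_bytes)))
print("EBCDIC bytes with no ASCII counterpart:", len(not_in_ascii))
```

So ASCII embeds into EBCDIC without loss, but a converter in the other
direction has to decide what to do with the remaining byte values.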
Note also that ASCII is a particular 7-bit code, viz. ISO 646 IRV,
whilst many vendors, and text-book authors, abuse the term "ASCII"
(or the similar term "ANSI") for a plethora of different encodings:
- MS-DOS abuses the term "ASCII" as a synonym for "text, in whatever
8-bit code currently is selected via the 'mode' command", usually
one of the IBM proprietary codes, CP 437 and CP 850;
- MS-Windows abuses the term "ANSI" for its proprietary 8-bit code,
CP 1252 (and perhaps also for other MS proprietary codes, depending
on the current language setting);
- many internet encoding utilities abuse the term "ASCII" for the
8-bit code "Latin-1" (ISO 8859-1), or its predecessor, the DEC multi-
lingual terminal code.
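The practical consequence of this naming confusion can be sketched in Python:
the same byte reads differently under each of the codes listed above. The
codec names ("cp437", "cp850", "cp1252", "latin-1") are those of Python's
standard library; the byte value 0x9B is merely a convenient example.

```python
# Sketch: one byte, four "ASCII"/"ANSI" readings. Byte 0x9B is an arbitrary
# example value chosen because the four code pages disagree about it.
raw = b"\x9b"

readings = {name: raw.decode(name)
            for name in ("cp437", "cp850", "cp1252", "latin-1")}

# Genuine 7-bit ASCII has no character at this position at all.
try:
    raw.decode("ascii")
    is_ascii = True
except UnicodeDecodeError:
    is_ascii = False

for name, ch in readings.items():
    print(f"{name:8} U+{ord(ch):04X}")
print("valid ASCII:", is_ascii)
```

A converter therefore needs to be told which code a file labelled "ASCII" or
"ANSI" really uses; it cannot be inferred from the label.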
> However, there appears to be more than one kind of EBCDIC.
Actually, there are 11 (or so) different EBCDICs currently supported for the
Latin-1 character set (the so-called CECPs = "Country Extended Code Pages",
if I am not mistaken), several other EBCDIC variants for other character
sets, and several hundred legacy EBCDIC variants.
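As a rough illustration of how two Latin-1 EBCDICs relate, the following
Python sketch compares the codecs "cp037" and "cp500" from Python's standard
library (assumed here to correspond to the US English and the International
code page, respectively). It lists the byte values on which they disagree and
checks that both cover the same set of characters.

```python
# Sketch: compare two Latin-1 EBCDIC code pages byte by byte.
# Assumption: Python's "cp037" and "cp500" codecs model the two code pages.
all_bytes = bytes(range(256))

diffs = {}
for b in range(256):
    us = bytes([b]).decode("cp037")
    intl = bytes([b]).decode("cp500")
    if us != intl:
        diffs[b] = (us, intl)

# Same characters overall, only placed at different code points?
same_repertoire = (set(all_bytes.decode("cp037"))
                   == set(all_bytes.decode("cp500")))

for b, (us, intl) in sorted(diffs.items()):
    print(f"0x{b:02X}: cp037={us!r} cp500={intl!r}")
print("same character repertoire:", same_repertoire)
```

The differences are few, but they hit characters such as the square brackets,
so a wrong guess corrupts source code and the like rather quickly.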
> I am working on an encoding converter.
Before embarking on any serious work concerning EBCDIC, you should obtain
your copy of the latest "CDRA Level 1 Reference" (SC09-1390) and "CDRA
Level 1 Registry" (SC09-1391) from your nearest IBM representative.
> Could someone tell me the difference between EBCDIC 500 and open EBCDIC?
What do you mean by "open EBCDIC"?
You may find the following tables useful:
English (US) CECP, also used in Canada, Netherlands, Portugal, Brazil,
Australia, and New Zealand
Belgium, Switzerland, and International CECP
(this was meant to become "the" international CECP, but this attempt
has failed; meanwhile, CECP 1046 is the agreed standard)
Turkish EBCDIC (Latin-5 set)
Windows code for Latin-1 countries (the "ANSI" misnomer)
The "classic" IBM PC code -- but see below
The "international" IBM PC codepage, containing (but not limited to)
the Latin-1 character set -- but see below
All mappings in <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/>
are subject to the correction outlined in
All of these tables map various code pages to Unicode, resulting in a common
descriptive framework for those distinct code pages.
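Using Unicode as the pivot, a byte-to-byte conversion table between any two
such code pages can be derived mechanically. The sketch below does this in
Python; the function name build_table and the "?" substitute are inventions
for this example, and Python's built-in "cp500" and "cp1252" codecs stand in
for the mapping tables.

```python
# Sketch: derive a 256-entry conversion table between two 8-bit code pages
# by going through Unicode. build_table is a hypothetical helper name.
def build_table(src, dst, substitute=b"?"):
    table = bytearray(256)
    for b in range(256):
        try:
            ch = bytes([b]).decode(src)   # source byte -> Unicode character
            out = ch.encode(dst)          # Unicode character -> target byte
        except UnicodeError:
            out = substitute              # no counterpart in one of the codes
        table[b] = out[0] if len(out) == 1 else substitute[0]
    return bytes(table)

ebcdic_to_windows = build_table("cp500", "cp1252")

ebcdic_data = "Grüße, World".encode("cp500")
windows_data = ebcdic_data.translate(ebcdic_to_windows)
print(windows_data.decode("cp1252"))
```

Characters without a counterpart in the target code are replaced by the
substitute, which is exactly where such a conversion stops being reversible.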