Re: Sun's Java encodings vs IANA's character set registry

From: Keld Jørn Simonsen ([email protected])
Date: Fri Apr 13 2001 - 18:10:58 EDT

Next message: Keld Jørn Simonsen: "Re: Unicode Collation Algorithm"
Previous message: Tex Texin: "[Fwd: Re: benefits of unicode]"
In reply to: Markus Scherer: "Re: Sun's Java encodings vs IANA's character set registry"
Next in thread: Yves Arrouye: "RE: Sun's Java encodings vs IANA's character set registry"

On Fri, Apr 13, 2001 at 11:32:16AM -0700, Markus Scherer wrote:
> It looks to me like the "Cp" names might be IBM CCSIDs. For those, have a look at the "ibm-" names in ICU's alias table at http://oss.software.ibm.com/cvs/icu/~checkout~/icu/data/convrtrs.txt
>
> Note that ICU uses "cp" to mean Microsoft codepage numbers.
>
> Note also that even IBM changes some of its tables over time and has in a few dozen cases multiple Unicode<->codepage tables per CCSID (see our entries for ibm-943 and ibm-1363).
>
> "Haphazard" is a good description of the situation...
> It is easy to have "repertoires" - the hard part is to have "one repertoire". The situation is beyond repair, although we (ICU) are still collecting and publishing data. Use Unicode, UTFs, SCSU.
>
> markus
>
> Mike Brown wrote:
> ...
> > I should not be surprised by your statement, but I am. It is distressing to
> > think that something that by definition should not be rocket science --
> > repertoires of abstract characters mapped directly to specific bit patterns
> > -- would be subject to such haphazard definition and even more haphazard
> > implementation.

The ISO charmap registry has unique names for encodings that do not
change and that are aligned with the IANA registry. See http://www.dkuug.dk/cultreg
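
As a rough illustration of why stable, registry-aligned names matter, here is a minimal Java sketch that asks a JVM which canonical name and aliases it reports for a few labels. It assumes a JVM that provides java.nio.charset.Charset; the class name CharsetAliases and the helpers canonicalName and sameCharset are hypothetical names made up for this example. The set of aliases is implementation-dependent, so labels other than the canonical ones may or may not resolve on a given JVM.

```java
import java.nio.charset.Charset;

public class CharsetAliases {
    // Hypothetical helper: the canonical name this JVM reports for a label.
    // Throws UnsupportedCharsetException if the label is not known.
    static String canonicalName(String label) {
        return Charset.forName(label).name();
    }

    // Hypothetical helper: true if both labels resolve to the same charset.
    static boolean sameCharset(String a, String b) {
        return Charset.forName(a).equals(Charset.forName(b));
    }

    public static void main(String[] args) {
        // Labels to probe; which ones resolve depends on the JVM in use.
        String[] labels = {"ISO-8859-1", "ISO_8859-1:1987", "latin1", "Cp1252"};
        for (String label : labels) {
            if (Charset.isSupported(label)) {
                Charset cs = Charset.forName(label);
                System.out.println(label + " -> " + cs.name()
                        + " aliases=" + cs.aliases());
            } else {
                System.out.println(label + " is not supported by this JVM");
            }
        }
    }
}
```

Comparing the printed aliases against the registry entries shows quickly where an implementation's private names (such as the "Cp" ones) diverge from the registered ones.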

Keld


This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:17:16 EDT