Re: UCS-2/4 & BOM

From: Markus Scherer (markus.icu@gmail.com)
Date: Thu Jun 02 2005 - 16:24:23 CDT

Next message: Theo Veenker: "Re: JIS X 0208 mappings in Unihan.txt"

Previous message: John H. Jenkins: "Re: JIS X 0208 mappings in Unihan.txt"
In reply to: Theo Veenker: "UCS-2/4 & BOM"
Next in thread: Erik van der Poel: "Re: UCS-2/4 & BOM"
Reply: Erik van der Poel: "Re: UCS-2/4 & BOM"

The IANA character sets list
(http://www.iana.org/assignments/character-sets) says:

<quote>
Name: ISO-10646-UCS-2
MIBenum: 1000
Source: the 2-octet Basic Multilingual Plane, aka Unicode
this needs to specify network byte order: the standard
does not specify (it is a 16-bit integer space)
Alias: csUnicode

Name: ISO-10646-UCS-4
MIBenum: 1001
Source: the full code space. (same comment about byte order,
these are 31-bit numbers.
Alias: csUCS4
</quote>

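I interpret this to mean that these are character encoding forms
(CEFs), not character encoding schemes (CESs) or charsets: they define
sequences of 16-bit or 32-bit code units, but not how those units are
serialized into bytes. They would not be the only items in the charsets
list that are not charsets.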
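
To illustrate the distinction, here is a minimal Python sketch. The
variable names are only for illustration, and it uses Python's
"utf-16-be" and "utf-16-le" codecs as stand-ins for the two possible
byte serializations of a BMP character; it is not meant as a definition
of either IANA name.

```python
# One BMP character: U+00E9 LATIN SMALL LETTER E WITH ACUTE.
text = "\u00e9"

# CEF view: a sequence of 16-bit code units (integers), no bytes yet.
# For BMP characters the code unit equals the code point.
code_units = [ord(c) for c in text]

# CES view: the same code unit serialized with an explicit byte order.
big_endian = text.encode("utf-16-be")     # high byte first
little_endian = text.encode("utf-16-le")  # low byte first

print([hex(u) for u in code_units], big_endian.hex(), little_endian.hex())
```

The same code unit yields two different byte sequences, which is why a
name that only identifies the 16-bit integer space does not tell the
receiver how to read the bytes.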

In practice, if you do see them specified, you might want to check
whether the text starts with what looks like a BOM. In other words, it
may be best to reinterpret them as the "UTF-16" and "UTF-32" charsets.

Or reject the text with an error. It's the sender's fault for using
these names :-)
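
A rough Python sketch of how a receiver might combine both options:
reinterpret the name as UTF-16 or UTF-32 when the data starts with a
BOM, and reject it otherwise. The function name decode_ucs and the
REINTERPRET table are hypothetical, and rejecting BOM-less input is my
assumption here, not something the IANA list prescribes.

```python
import codecs

# Hypothetical mapping from the IANA names to the charsets we
# reinterpret them as.
REINTERPRET = {
    "iso-10646-ucs-2": "utf-16",
    "iso-10646-ucs-4": "utf-32",
}

# BOM byte signatures accepted for each target charset.
BOMS = {
    "utf-16": (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE),
    "utf-32": (codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE),
}


def decode_ucs(data: bytes, charset: str) -> str:
    """Decode data labeled ISO-10646-UCS-2/4, requiring a leading BOM."""
    target = REINTERPRET.get(charset.lower())
    if target is None:
        raise LookupError(f"not a UCS-2/UCS-4 label: {charset!r}")
    # Without a BOM the byte order is unknown (Python's own "utf-16"
    # codec would fall back to the platform's native order), so this
    # sketch rejects the input instead of guessing.
    if not data.startswith(BOMS[target]):
        raise ValueError(f"{charset}: no BOM, byte order unknown")
    # The "utf-16"/"utf-32" codecs read the BOM and strip it.
    return data.decode(target)
```

For example, decode_ucs(codecs.BOM_UTF16_BE + b"\x00h\x00i",
"ISO-10646-UCS-2") returns "hi", while the same bytes without the BOM
raise ValueError.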

On 6/2/05, Theo Veenker <Theo.Veenker@let.uu.nl> wrote:
> If someone sends me a text file marked charset=ISO-10646-UCS-2
> or charset=ISO-10646-UCS-4, should an initial BOM in this file have
> the same meaning as a BOM in UTF-16/32?

markus

-- 
Opinions expressed here may not reflect my company's positions unless
otherwise noted.


This archive was generated by hypermail 2.1.5 : Thu Jun 02 2005 - 16:25:31 CDT