Re: UCS-2/4 & BOM

From: Erik van der Poel (erik@vanderpoel.org)
Date: Thu Jun 02 2005 - 19:47:25 CDT

In reply to: Markus Scherer: "Re: UCS-2/4 & BOM"

Markus Scherer wrote:
> The IANA character sets list
> (http://www.iana.org/assignments/character-sets) says:
>
> <quote>
> Name: ISO-10646-UCS-2
> MIBenum: 1000
> Source: the 2-octet Basic Multilingual Plane, aka Unicode
> this needs to specify network byte order: the standard
> does not specify (it is a 16-bit integer space)
> Alias: csUnicode
> </quote>
>
> I interpret this to mean that these are CEFs, not CESs or charsets.

The term "network byte order" is often used in network protocol
communities, and it means big-endian (see e.g. RFC 951 section 3). So,
another interpretation of "this needs to specify network byte order" is
that this charset registration entry still needs to be amended to
properly specify that "they" have big-endian in mind. I personally think
that this is more likely to be the intended interpretation, though I
wouldn't argue with anyone saying that the wording is unclear.
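
To make the byte-order question concrete, here is a minimal sketch of how the same two BMP code points serialize under each reading. The helper name encode_ucs2 and the sample string are my own illustrative assumptions, not anything taken from the IANA registry or from ISO 10646.

```python
import struct

def encode_ucs2(text, byteorder="big"):
    """Serialize BMP code points as 2-octet units (hypothetical helper).

    byteorder="big" corresponds to "network byte order";
    byteorder="little" is the other possible serialization.
    """
    fmt = ">H" if byteorder == "big" else "<H"
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF:
            # UCS-2 has no way to represent code points beyond the BMP.
            raise ValueError("code point outside the BMP: U+%04X" % cp)
        out += struct.pack(fmt, cp)
    return bytes(out)

sample = "A\u20ac"  # U+0041 and U+20AC, both in the BMP
big = encode_ucs2(sample, "big")        # octets 00 41 20 AC
little = encode_ucs2(sample, "little")  # octets 41 00 AC 20
```

If the registry entry does mean big-endian, only the first octet sequence would be a valid ISO-10646-UCS-2 stream when no BOM is present.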

I'm Cc-ing the ietf-charsets list, in the hope that this entry might be
clarified (along with the UCS-4 entry).

Erik van der Poel

