At 11:41 AM 7/20/00 -0800, Ken Krugler wrote:
>No. UCS-2 and UCS-4 have always been bigendian. Read ISO 10646-1:1993,
>section "6.3 Octet order" (page 7):
> When serialized as octets, a more significant octet shall
> precede less significant octets.
The section continues: "When not serialized as octets the order of octets
may be specified by an agreement between sender and recipient (see clause
17.1 and Annex F)"
Annex F introduces the BOM.
On the face of it, the two parts of clause 6.3 seem somewhat
self-contradictory and could stand some editorial clarification,
but on the whole, even ISO/IEC 10646 recognizes that other byte orders
exist and suggests a means (in Annex F) by which sender and recipient
might communicate the byte order in use.
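
As a rough illustration of how a recipient might use the BOM to learn the
sender's byte order, here is a minimal Python sketch. The function name
detect_utf16_byte_order and the sample data are hypothetical, and the sketch
assumes the data is already known to consist of 16-bit code units (it does
not try to tell UTF-16 apart from 32-bit forms).

```python
import codecs

def detect_utf16_byte_order(data: bytes) -> str:
    """Return 'big', 'little', or 'unmarked' based on a leading BOM.

    Hypothetical helper: assumes the input is 16-bit Unicode data.
    """
    if data.startswith(codecs.BOM_UTF16_BE):  # octets FE FF
        return "big"
    if data.startswith(codecs.BOM_UTF16_LE):  # octets FF FE
        return "little"
    # No BOM: the byte order must come from some other agreement
    # between sender and recipient.
    return "unmarked"

# Sample data: the letter "A" (U+0041) preceded by a BOM in each order.
sample_be = codecs.BOM_UTF16_BE + "A".encode("utf-16-be")
sample_le = codecs.BOM_UTF16_LE + "A".encode("utf-16-le")
sample_plain = "A".encode("utf-16-be")

print(detect_utf16_byte_order(sample_be))
print(detect_utf16_byte_order(sample_le))
print(detect_utf16_byte_order(sample_plain))
```

When no BOM is present the sketch reports "unmarked" rather than guessing,
since in that case the order has to be settled by other means.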
Since this clause was written (1991), both the amount of data in the
various byte orders and practical experience with Unicode have increased
dramatically. The full discussion is available in
http://www.unicode.org/unicode/reports/tr17 Character Encoding Model
as well as in the relevant sections of The Unicode Standard, Version 3.0.
Note that there is no such thing as UCS-2LE or UCS-2BE. These terms are not
defined anywhere, but UTF-16LE and UTF-16BE are. Unicode has adopted the
philosophy that an indication of subsets (e.g. whether surrogate-accessible
characters are supported or not) is not something that belongs in the
designation of the encoding form.
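
To make the UTF-16BE / UTF-16LE distinction concrete, here is a small Python
sketch using Python's own codec names "utf-16-be" and "utf-16-le". The sample
string is an arbitrary choice for illustration: one BMP character plus U+10000,
which needs a surrogate pair.

```python
# "A" is U+0041; U+10000 is encoded in UTF-16 as the surrogate
# pair D800 DC00.
text = "A\U00010000"

be = text.encode("utf-16-be")  # more significant octet first, no BOM
le = text.encode("utf-16-le")  # less significant octet first, no BOM

print(be.hex())  # 0041d800dc00
print(le.hex())  # 410000d800dc

# Both serializations decode back to the same text.
print(be.decode("utf-16-be") == le.decode("utf-16-le") == text)  # True
```

Only the order of the two octets within each 16-bit code unit differs; the
order of the code units themselves, including the two surrogates, is the same
in both.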