Re: Last Call: UTF-16, an encoding of ISO 10646 to Informational

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue Aug 17 1999 - 15:39:08 EDT


Frank wrote:

>
> Juliusz Chroboczek <jec@dcs.ed.ac.uk> wrote:
> >
> > (To clarify: I do not have any preference for one canonical order over
> > the other. Just pick one and stick to it consistently.)
> >
> There is a good authority on which one to pick:
>
> ISO/IEC 10646-1 : 1993 (E), Paragraph 6.3:
>
> "The sequence of the octets that represent a character, and the most
> significant and least significant ends of it, shall be maintained as shown
> above [in Section 6.2, illustrating Big Endian byte organization]. When
> serialised as octets, a more significant octet shall precede less significant
> octets."

That conveniently neglects the very next sentence of clause 6.3, which
continues:

"When not serialized as octets, the order of octets may be specified
by arrangement between sender and recipient (see 16.1 and annex H)."

The specification of the labels UTF-16BE and UTF-16LE (and UTF-16,
using FEFF as a signature, as specified in annex H of 10646) is
all about having labels we can all agree on, so that such arrangements
between sender and recipient can be made in an open and reliable
way.
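
To make the three labels concrete, here is a minimal sketch in Python.
It is an illustration only, not text from the draft or from 10646. The
helper name decode_utf16 is hypothetical, and falling back to
big-endian order when no signature is present is an assumption of this
sketch (it matches the order clause 6.3 gives for serialized octets).

```python
import codecs

text = "A\u20ac"  # 'A' and the euro sign, both 16-bit code units

# Labelled forms: the label itself fixes the octet order, no signature.
be = text.encode("utf-16-be")   # octets 00 41 20 AC
le = text.encode("utf-16-le")   # octets 41 00 AC 20

# Plain "UTF-16": prefix the signature U+FEFF so the recipient can
# detect the order from the first two octets (FE FF or FF FE).
signed_be = codecs.BOM_UTF16_BE + be
signed_le = codecs.BOM_UTF16_LE + le


def decode_utf16(data: bytes) -> str:
    """Hypothetical helper: honour the signature if one is present."""
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[2:].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[2:].decode("utf-16-le")
    # Assumption of this sketch: no signature means big-endian.
    return data.decode("utf-16-be")
```

With the label or the signature in hand, both octet orders decode to
the same text; without either, the recipient can only guess.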

--Ken

>
> - Frank
>


