Re: Unicode in VFAT file system

Date: Fri Jul 21 2000 - 11:26:09 EDT

>As a serialization, UTF-16 has three forms: UTF-16, UTF-16BE, and
>UTF-16LE. The first is with (optionally) a BOM, and the others without.

I know this is what the Standard dictates, and I think I understand why,
but it doesn't make complete sense to the novice trying to find his/her

<novice attitude=pondering&confused&frustrated>
Why does it say there are three varieties when a 16-bit datum can only be
serialised in two orders? If the scheme UTF-16 doesn't have a BOM, isn't it
just one of the other two? When it does have a BOM, it can still be
serialised in two ways, so aren't there four schemes - 2 serialisations x
±BOM? I barely manage to make sense of forms and schemes and then they
confuse me with this stuff!
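
The "2 serialisations x ±BOM" count can be made concrete with a minimal sketch. It assumes Python's standard codecs, and the helper name `serialisations` is made up purely for illustration. It only enumerates the four possible byte streams for a given text; it does not claim to settle which label each stream may carry.

```python
import codecs

def serialisations(text):
    """Return the four byte streams: 2 byte orders x (BOM or no BOM).

    Keys are (byte_order, has_bom); the name and layout are hypothetical.
    """
    be = text.encode("utf-16-be")  # big-endian code units, no BOM added
    le = text.encode("utf-16-le")  # little-endian code units, no BOM added
    return {
        ("BE", False): be,
        ("LE", False): le,
        ("BE", True): codecs.BOM_UTF16_BE + be,  # FE FF prefix
        ("LE", True): codecs.BOM_UTF16_LE + le,  # FF FE prefix
    }

streams = serialisations("A")  # U+0041
for (order, has_bom), data in streams.items():
    print(order, "BOM" if has_bom else "no BOM", data.hex())
```

For the single character "A" this yields four distinct byte sequences (0041, 4100, feff0041, fffe4100), which is the count of four the question arrives at before the three labels are considered.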

Don't we really mean that there are three approved ways in which the
encoding scheme of a stream can be labelled? Wouldn't it be clearer to say
that UTF-16 has two serialisations (not forms! since we're talking about
schemes), and that the encoding scheme of a stream can be labelled in one
of three ways: UTF-16, UTF-16BE and UTF-16LE?

- Peter

Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <>
