Re: Why is "endianness" relevant when storing data on disks but not when in memory?

From: Doug Ewell <doug_at_ewellic.org>
Date: Sun, 6 Jan 2013 17:58:38 -0700

Leif Halvard Silli wrote:

> I believe that even the U+FEFF *itself* is either UTF-32LE or
> UTF-32BE.
> Thus, there is, per se, no implication of lack of byte-order mark in
> Martin’s statement.

By definition, data in the "UTF-nBE" or "UTF-nLE" encoding scheme (for
whatever value of n) does not have a byte-order mark.
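
A minimal sketch of that distinction, assuming Python's built-in codecs (they are not part of this thread, and the variable names are only illustrative): the "utf-32-be" and "utf-32-le" codecs write no byte-order mark, while the unmarked "utf-32" codec prepends one.

```python
# Sketch: compare the bytes produced by the three UTF-32 codecs in Python.
text = "A"  # U+0041

be = text.encode("utf-32-be")   # big-endian, no BOM
le = text.encode("utf-32-le")   # little-endian, no BOM
plain = text.encode("utf-32")   # BOM first, then data in platform byte order

BOM_BE = b"\x00\x00\xfe\xff"    # U+FEFF serialized big-endian
BOM_LE = b"\xff\xfe\x00\x00"    # U+FEFF serialized little-endian

print(be.hex())                 # 00000041
print(le.hex())                 # 41000000
print(plain[:4] in (BOM_BE, BOM_LE))  # True: the output starts with a BOM
```

Which of the two BOM forms "utf-32" emits depends on the machine running the code, so the sketch checks for either one.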

> Assuming that the label "UTF-32" is defined the
> same way as the label "UTF-16", then it is an umbrella label or a
> "macro label" (hint: macro language) which covers the two *real*
> encodings - UTF-32LE and UTF-32BE.

I've sometimes wished it were that way, that (for example) the
"UTF-32BE" and "UTF-32LE" encoding schemes were defined as variations of
"UTF-32" with special rules related to the BOM, not defined as
completely separate encoding schemes. But that's not how the definitions
are written.

The LE and BE versions are not at all "the two *real* encodings" when
there is real-world data that contains an initial U+FEFF meant to be
interpreted as a BOM or "signature."
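
As a rough illustration of why the label matters for such data, again assuming Python's built-in codecs rather than anything from this thread: the same bytes decode differently depending on whether they are labeled "UTF-32" or "UTF-32BE".

```python
# Sketch: one byte sequence, two labels, two results.
# The sample data is hypothetical: an initial U+FEFF followed by "A",
# both serialized big-endian.
data = b"\x00\x00\xfe\xff" + b"\x00\x00\x00\x41"

as_utf32 = data.decode("utf-32")       # initial U+FEFF is consumed as a BOM
as_utf32be = data.decode("utf-32-be")  # initial U+FEFF is kept as a character

print(repr(as_utf32))     # 'A'
print(repr(as_utf32be))   # '\ufeffA'
```

Under the "utf-32" label the leading U+FEFF acts as a signature and disappears from the decoded text; under "utf-32-be" it survives as part of the content.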

--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell
Received on Sun Jan 06 2013 - 19:01:21 CST
