On Wed, Apr 24, 2002 at 09:00:17AM -0700, Doug Ewell wrote:
> The Unix and Linux world is very
> opposed to the use of BOM in plain-text files, and if they feel that way
> about UTF-8 they probably feel the same about UTF-16.
Why? The problems with a BOM in UTF-8 have to do with it being an
ASCII-compatible encoding. (I'd guess that if there are any Unixes that
use EBCDIC, the same problems would apply to UTF-EBCDIC.) Pretty much
the only reason one would use UTF-16 is to be compatible with a foreign
system, and then you use the conventions of that system.
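To make the ASCII-compatibility point concrete, here is a minimal Python sketch. The sample script text and variable names are made up for illustration. It only shows which bytes end up at the start of the file under each encoding, which is what a tool that expects ASCII (a kernel looking for "#!", say) would see.

```python
import codecs

# Hypothetical sample: a tiny shell script that is pure ASCII.
text = "#!/bin/sh\necho hello\n"

plain = text.encode("utf-8")         # UTF-8 without a BOM
with_bom = text.encode("utf-8-sig")  # UTF-8 with a leading BOM
utf16_le = text.encode("utf-16-le")  # UTF-16, little-endian, no BOM

# Without a BOM, UTF-8 output is byte-for-byte identical to ASCII.
print(plain == text.encode("ascii"))

# With a BOM, three extra bytes (EF BB BF) come first, so the file
# no longer begins with "#!" as far as an ASCII-only tool can tell.
print(with_bom[:3] == codecs.BOM_UTF8)
print(with_bom.startswith(b"#!"))

# UTF-16 is not ASCII-compatible with or without a BOM: every ASCII
# character is followed by a zero byte, so a BOM costs nothing extra.
print(utf16_le.startswith(b"#!"))
```

So the BOM takes something away from UTF-8 (transparent handling by ASCII tools) that UTF-16 never had in the first place.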
Also, look at the output of file:
n2404r.doc: Microsoft Office document data
file.utf8: UTF-8 Unicode English text
file.utf16: Little-endian UTF-16 Unicode English character data
file_list: ASCII text
There are basically two categories here: data or text. But UTF-16 is not
considered text; it's considered data, like a Word file. Most Unix users
would treat a UTF-16 encoded file the same way: as a format to be
converted from, or edited in a word processor only.
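As a rough illustration of that text/data split, here is a Python sketch of a classifier. It is an assumption-laden toy, not how file(1) is actually implemented: the function name classify and its labels are hypothetical, and it ignores UTF-32, EBCDIC, and every other format file knows about.

```python
import codecs

def classify(data: bytes) -> str:
    """Hypothetical classifier; NOT the real file(1) algorithm."""
    # A UTF-16 BOM marks the file as something to convert, not as text.
    if data.startswith(codecs.BOM_UTF16_LE):
        return "UTF-16 (little-endian) data"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "UTF-16 (big-endian) data"
    # Pure 7-bit content is plain ASCII text.
    try:
        data.decode("ascii")
        return "ASCII text"
    except UnicodeDecodeError:
        pass
    # Anything that decodes as UTF-8 still counts as text.
    try:
        data.decode("utf-8")
        return "UTF-8 text"
    except UnicodeDecodeError:
        return "data"

print(classify(b"one\ntwo\n"))
print(classify("na\u00efve\n".encode("utf-8")))
print(classify(codecs.BOM_UTF16_LE + "hi\n".encode("utf-16-le")))
```

The ASCII and UTF-8 cases land on the "text" side, and the UTF-16 case lands on the "data" side, next to the Word document.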
-- 
David Starner - firstname.lastname@example.org
"It's not a habit; it's cool; I feel alive. If you don't have it you're
on the other side." - K's Choice (probably referring to the Internet)