Re: 32'nd bit & UTF-8

From: Hans Aberg (haberg@math.su.se)
Date: Tue Jan 18 2005 - 18:09:33 CST

    On 2005/01/18 22:58, D. Starner at shalesller@writeme.com wrote:

    > "Jon Hanna" writes:
    >
    >>> In , the use of BOM is
    >>> discouraged for use on UNIX platforms. So if endianness appears to
    >>> become a problem, it might be better to use UTF-8 externally, and then
    >>> convert it to UTF-32/H/L internally in the program.
    >>
    >> Discouraged or not, it's in the standard, you have to support it.
    >
    > That may or may not be true in a standards conformance sense, but
    > it's definitely not true in the real world. UTF-8 is a minimal
    > change for easy conversion in the Unix world, and nobody is going
    > to change the low-level tools to recognize the BOM, especially the
    > ones that are used for byte-streams as well as text. Stuff that
    > supports a dozen different formats may as well support UTF-8 BOMs,
    > but a lot of stuff doesn't and won't.

    UTF-8 BOMs seem pointless.

      Hans Aberg
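
For illustration, the approach mentioned in the quoted text (UTF-8 externally, UTF-32 code points internally) might look roughly like the sketch below. It is written in Python, which is an assumption and not something used in this thread, and the helper name `decode_utf8_external` is hypothetical. The sketch tolerates a leading UTF-8 BOM by dropping it, and never writes one.

```python
import codecs
from typing import List


def decode_utf8_external(data: bytes) -> List[int]:
    """Decode external UTF-8 bytes into a list of code points.

    Hypothetical helper: a leading UTF-8 BOM (EF BB BF) is accepted
    and skipped, so input with or without one decodes the same way.
    """
    if data.startswith(codecs.BOM_UTF8):
        # Drop the three-byte signature; it carries no byte-order
        # information in UTF-8.
        data = data[len(codecs.BOM_UTF8):]
    # Each Python str element is one code point, which is the value a
    # UTF-32 code unit would hold.
    return [ord(ch) for ch in data.decode("utf-8")]


def encode_utf8_external(code_points: List[int]) -> bytes:
    """Encode internal code points back to UTF-8, without a BOM."""
    return "".join(chr(cp) for cp in code_points).encode("utf-8")
```

Since UTF-8 is a byte sequence with a fixed order, the endianness question only arises for the internal UTF-32 form, which never leaves the program in this sketch.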


