Re: UTF-8 'BOM'

From: Hans Aberg
Date: Thu Jan 20 2005 - 08:14:47 CST

    On 2005/01/20 12:36, Arcane Jill wrote:

    > I enjoy slagging off Microsoft as much as anyone, but this is really out of
    > place here. Microsoft did not invent the BOM. Rather, they correctly
    > implemented the Unicode Standard. If the Unicode Standard were different in
    > this regard, I'm sure that MS text files would follow suit.

    Actually, other posters said it was the other way around: MS invented
    BOMs for use with UTF-16 and decided they were practical in UTF-8 too.
    Unicode then followed suit, without giving thought to what would happen on
    other platforms.

    > And ... turning your reasoning around a little here ... BOM-less text files (I
    > would not be so crass as to call them "Unix text files") can just as easily
    > cause problems on Unicode Conforming platforms, because the encoding is then
    > unknown.

    That is evidently not the case on UNIX platforms, where the encoding is
    kept track of via the locale, nor on, say, the old pre-X Mac OS (MacOS X
    has UNIX BSD at the bottom), which has special resources to indicate it.
    There are some
    other platforms which keep track of file encodings and the like via special

    The main point is that Unicode should not impose file format conditions
    as part of a character encoding format. Unicode can specify special file
    formats, of course, but they should not be confused with the character
    encoding. This enables each platform to do what works best for it.
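
    To illustrate the distinction, here is a minimal Python sketch. It is
    only an illustration under my own assumptions, not anything taken from
    the Unicode Standard. It treats the three-byte UTF-8 signature as a
    file-level convention that is checked for and stripped before decoding,
    so the character decoder itself never sees it. The helper name
    split_bom is hypothetical; codecs.BOM_UTF8 and the "utf-8-sig" codec
    come from the Python standard library.

```python
import codecs

# The UTF-8 signature ("BOM") is the three bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def split_bom(data):
    """Hypothetical helper: return (had_bom, payload) for a byte string.

    The signature is handled as a file-format detail, separately from
    the character decoding of the payload.
    """
    if data.startswith(UTF8_BOM):
        return True, data[len(UTF8_BOM):]
    return False, data


text = "na\u00efve"
plain = text.encode("utf-8")        # no signature
signed = text.encode("utf-8-sig")   # signature prepended

# File-format layer: detect and strip the signature, then decode.
had_bom, payload = split_bom(signed)
decoded = payload.decode("utf-8")

# A plain UTF-8 decoder does not strip the signature; it yields U+FEFF
# as the first character of the text.
leaked = signed.decode("utf-8")
```

    With this split, a platform that already knows the encoding (from the
    locale, say) can decode the payload directly, while one that relies on
    the signature can use it, and neither needs the other's convention
    built into the character encoding.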

    > If this forum turns into a "my OS is better than your OS" war, I'm leaving.

    Even later MS OSes have a Mach kernel at the bottom, which is a UNIX kernel
    extension that admits threads, and MacOS X already has UNIX BSD at the
    bottom. So while the whole world seems to become more UNIX'y, Unicode goes
    its own way.

      Hans Aberg
