Re: UTF-8 'BOM'

From: Christopher Fynn (cfynn@gmx.net)
Date: Thu Jan 20 2005 - 07:14:25 CST

    Hans Aberg wrote:

    > It is much better if the BOM is illegal in UTF-8. It does not prevent MS to
    > use it, instead labelling it as a file format marker for MS text files. A
    > program that then deals with MS text files must then know about the BOM and
    > remove it when and if appropriate. At the same time, it does not cause any
    > problems for programs that normally do not handle MS text files but only
    > plain text: They are fine as they are. Everyone should be able to be happy.

    Since the BOM (U+FEFF) is a valid Unicode & ISO 10646 character and
    UTF-8 is a transformation format of Unicode & 10646, if the BOM were
    illegal in UTF-8 then UTF-8 couldn't be used for *all* Unicode characters.

    - Chris
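
The point in the reply above can be illustrated with a minimal Python sketch. It assumes only the standard library; the helper names `utf8_round_trips` and `strip_leading_bom` are hypothetical, introduced here for illustration. It shows that U+FEFF has an ordinary three-byte UTF-8 encoding like any other character, and that a program which wants to treat a leading BOM as a file signature can strip it at decode time.

```python
import codecs

# U+FEFF: ZERO WIDTH NO-BREAK SPACE, the character used as the byte order mark.
BOM = "\ufeff"


def utf8_round_trips(ch: str) -> bool:
    """Return True if ch survives encoding to UTF-8 and decoding back."""
    return ch.encode("utf-8").decode("utf-8") == ch


def strip_leading_bom(data: bytes) -> str:
    """Decode UTF-8 bytes, dropping one leading BOM if present.

    Hypothetical helper: the 'utf-8-sig' codec removes a leading
    EF BB BF signature and otherwise behaves like plain 'utf-8'.
    """
    return data.decode("utf-8-sig")


encoded = BOM.encode("utf-8")

# U+FEFF is encoded like any other character in the U+0800..U+FFFF range.
print(" ".join(f"{b:02x}" for b in encoded))

# The standard library exposes the same three bytes as a constant.
print(encoded == codecs.BOM_UTF8)

# The character round-trips, so UTF-8 covers it like every other code point.
print(utf8_round_trips(BOM))

# A consumer that does not want the signature can remove it when decoding.
print(strip_leading_bom(codecs.BOM_UTF8 + b"plain text"))
```

Whether to strip the signature stays a decision for the consuming program; the encoding itself treats U+FEFF as just another character.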


