Re: 32'nd bit & UTF-8

From: Marcin 'Qrczak' Kowalczyk (qrczak@knm.org.pl)
Date: Fri Jan 21 2005 - 16:48:49 CST

    "Richard T. Gillam" <rgillam@las-inc.com> writes:

    > UTF-8 HAS NO BOM. There is nothing in the Unicode standard mandating or
    > even encouraging the use of EF BB BF at the beginning of a UTF-8 file.
    > That sequence has no special meaning in UTF-8; it's just a zero-width
    > non-breaking space.

    Let's assume that I design a programming language, specify that its
    source files should be encoded in UTF-8, don't mention anything about
    a BOM, implement a compiler which happens to fail with a lexing error
    when the file begins with a BOM (U+FEFF is not whitespace: its general
    category is Cf, not Zs), and somebody complains that the compiler
    doesn't conform to the spec because it rejects the BOM. Who is right?
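
    The category claim above is easy to check. What follows is only a
    minimal Python sketch, not part of any real compiler: strip_bom is a
    hypothetical helper, and dropping a single leading U+FEFF is just one
    possible policy that a language spec could choose to state explicitly.

```python
import codecs
import unicodedata

BOM = "\ufeff"

# General category of U+FEFF is Cf (format), not Zs (space separator),
# so a lexer that skips only Zs-style whitespace will not skip it.
bom_category = unicodedata.category(BOM)
space_category = unicodedata.category(" ")
bom_is_space = BOM.isspace()

# Encoded as UTF-8, U+FEFF is the three bytes EF BB BF.
bom_bytes = BOM.encode("utf-8")


def strip_bom(source: bytes) -> str:
    """Hypothetical helper: decode UTF-8 source text and drop one
    leading U+FEFF if present (an assumed policy, not a requirement)."""
    text = source.decode("utf-8")
    return text[1:] if text.startswith(BOM) else text
```

    A lexer that calls something like strip_bom before tokenizing accepts
    both kinds of file; one that does not will see U+FEFF as its first
    character and has to decide what to do with a Cf character there.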

    -- 
       __("<         Marcin Kowalczyk
       \__/       qrczak@knm.org.pl
        ^^     http://qrnik.knm.org.pl/~qrczak/
    

