Re: 32'nd bit & UTF-8

From: Arcane Jill
Date: Thu Jan 20 2005 - 09:15:09 CST


    -----Original Message-----
    From: []On
    Behalf Of Hans Aberg
    Sent: 20 January 2005 14:14
    To: Peter Kirk
    Cc: 'Unicode'
    Subject: Re: Subject: Re: 32'nd bit & UTF-8

    > As a standard, Unicode will have to fight for recognition.

    Mebbe, but it doesn't have a lot of competition. As a girl, I used to imagine
    that one day there would be one single super-ASCII character set, with all the
    characters in the world in it. (I was a sad kid). Now we have one, and it
    doesn't seem to have any rivals. So what's the choice? Unicode versus Latin1?
    I'll take the one with a gazillion characters in it, thank you very much.

    > Just as I, and others will, oppose the UTF-8 BOM requirement for good
    > reasons.

    Are we all clear about what the BOM requirement actually /is/, by the way?
    Unicode does NOT require that all UTF-8 text files must begin with a BOM; it
    only requires that conformant processes can recognize and handle the BOM
    character /if/ it should be found.
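
    To make that concrete, here is a minimal Python sketch of a reader that
    accepts a BOM without demanding one. The function name decode_utf8 is
    made up for illustration; it is one way to satisfy the requirement as
    described above, not the only one.

```python
import codecs


def decode_utf8(data: bytes) -> str:
    """Decode UTF-8 bytes, tolerating (but not requiring) a leading BOM.

    decode_utf8 is a hypothetical helper name used only for this sketch.
    """
    # codecs.BOM_UTF8 is the three-byte sequence EF BB BF, i.e. U+FEFF in UTF-8.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8")


with_bom = codecs.BOM_UTF8 + "hello".encode("utf-8")
without_bom = "hello".encode("utf-8")

# Both inputs decode to the same text.
print(decode_utf8(with_bom) == decode_utf8(without_bom))
```

    Python's built-in "utf-8-sig" codec behaves the same way on decoding, so
    data.decode("utf-8-sig") would do the job in one call.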

    > You are drawing this analogue too far, because it is fairly easy to fix the
    > \r\n problem, whereas the BOM problem runs deeper. The latter changes the
    > very paradigm for file representation.

    I don't see why. What is the difference between discarding U+000Ds and
    discarding U+FEFFs?
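
    A rough Python sketch of the comparison, with made-up helper names
    (discard, normalize). It assumes every occurrence of either character is
    simply dropped; a stricter reader might drop U+FEFF only at the very
    start of the text.

```python
def discard(text: str, unwanted: str) -> str:
    """Remove every occurrence of one character (hypothetical helper)."""
    return text.replace(unwanted, "")


def normalize(text: str) -> str:
    """Drop carriage returns and BOM characters with the same operation."""
    # Assumption: U+FEFF is discarded wherever it occurs, not only at the start.
    text = discard(text, "\u000d")  # U+000D CARRIAGE RETURN
    text = discard(text, "\ufeff")  # U+FEFF, the BOM character
    return text


sample = "\ufeffline one\r\nline two\r\n"
print(repr(normalize(sample)))
```

    The two characters go through exactly the same code path, which is the
    point of the question.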

