Re: 32'nd bit & UTF-8

From: Hans Aberg (haberg@math.su.se)
Date: Thu Jan 20 2005 - 14:46:57 CST

Next message: Hans Aberg: "Re: Subject: Re: 32'nd bit & UTF-8"

Previous message: Hans Aberg: "Re: 32'nd bit & UTF-8"
In reply to: Arcane Jill: "Re: 32'nd bit & UTF-8"
Next in thread: Philippe VERDY: "Re: Re: 32'nd bit & UTF-8"
Maybe reply: Philippe VERDY: "Re: Re: 32'nd bit & UTF-8"

On 2005/01/20 16:15, Arcane Jill at arcanejill@ramonsky.com wrote:

>> As a standard, Unicode will have to fight for recognition.
>
> Mebbe, but it doesn't have a lot of competition.

It is dangerous for Unicode to assume that it does not have any competition,
and to arrogantly ignore the issues that the users put forth. UNIXes will
strip out the BOM anyway, that seems clear, because it does not fit into their
file and streams model.

>> Just as I, and others will, oppose the UTF-8 BOM requirement for good
>> reasons.
>
> Are we all clear about what the BOM requirement actually /is/, by the way?
> Unicode does NOT require that all UTF-8 text files must begin with a BOM; it
> only requires that conformant processes can recognize and handle the BOM
> character /if/ it should be found.

So UNIX processes are not, and will not be, conformant Unicode UTF-8
processes as long as the BOM requirement remains.

>> You are drawing this analogue too far, because it is fairly easy to fix the
>> \r\n problem, whereas the BOM problem runs deeper. The latter changes the
>> very paradigm for file representation.
>
> I don't see why. What is the difference between discarding U+000Ds and
> discarding U+FEFFs ?

This has been widely discussed in other posts. In short, it runs a great deal
deeper into the UNIX OS. See the posts by Marcin 'Qrczak' Kowalczyk, or
<http://www.cl.cam.ac.uk/~mgk25/unicode.html>.
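
As a rough illustration of the difference (a sketch only, not taken from the
posts cited above): discarding CRs can be done on any chunk of a byte stream
without knowing where a file starts, whereas a BOM is only a signature at the
very start of a file, so it survives plain byte-wise concatenation and hides a
leading "#!". The helper names concat, strip_leading_bom and strip_cr are
hypothetical, and Python is used only for brevity.

```python
import codecs

BOM = codecs.BOM_UTF8  # the three bytes EF BB BF, i.e. U+FEFF encoded as UTF-8


def concat(*chunks: bytes) -> bytes:
    """Byte-wise concatenation, which is all a tool like cat does."""
    return b"".join(chunks)


def strip_leading_bom(data: bytes) -> bytes:
    """Remove a BOM only if it sits at the very start of the data."""
    return data[len(BOM):] if data.startswith(BOM) else data


def strip_cr(data: bytes) -> bytes:
    """Turn CRLF into LF; this needs no knowledge of file boundaries."""
    return data.replace(b"\r\n", b"\n")


# Two hypothetical files, each saved with a BOM and CRLF line ends.
file_a = BOM + b"#!/bin/sh\r\n"
file_b = BOM + b"echo hi\r\n"

joined = concat(file_a, file_b)

# The CR fix is purely local: one pass over the stream removes every CR.
no_cr = strip_cr(joined)

# The BOM fix is not: only the first BOM is at the start of the stream,
# so the second one remains embedded in the middle of the text.
no_lead = strip_leading_bom(joined)
bom_still_inside = BOM in no_lead

# A file that begins with a BOM no longer begins with the bytes "#!".
starts_with_shebang = file_a.startswith(b"#!")
```

Here bom_still_inside is true and starts_with_shebang is false, while no_cr
contains no CR at all. A tool that wanted to remove every BOM would have to
know where each original file began, which a byte stream does not record.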

Hans Aberg

This archive was generated by hypermail 2.1.5 : Thu Jan 20 2005 - 14:50:19 CST