Re: Subject: Re: 32'nd bit & UTF-8

From: Hans Aberg
Date: Thu Jan 20 2005 - 12:16:28 CST

    On 2005/01/20 14:44, Philippe Verdy wrote:

    > Treating a leading BOM in files does not require a stateful scanner. In
    > fact when you use files, you already have to handle a required state:
    > whether the file exists, can be opened, is opened, is flushed or
    > resynchronized in buffers, and is closed.
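
The quoted claim about a leading BOM can be made concrete with a minimal
sketch in Python. It assumes a binary stream positioned at its start; the
helper name strip_leading_bom is hypothetical and not part of any library.
The check happens once, on the first three bytes, so no scanner state is
carried through the rest of the data.

```python
import codecs
import io


def strip_leading_bom(stream):
    """Return the stream's bytes with one leading UTF-8 BOM removed, if present.

    Hypothetical helper for illustration: only the first three bytes are
    inspected, so a BOM sequence occurring later in the data is left alone.
    """
    head = stream.read(len(codecs.BOM_UTF8))  # BOM_UTF8 is b"\xef\xbb\xbf"
    if head == codecs.BOM_UTF8:
        head = b""  # drop the signature, keep everything after it
    return head + stream.read()


if __name__ == "__main__":
    data = io.BytesIO(codecs.BOM_UTF8 + b"hello")
    print(strip_leading_bom(data))
```

For text input, Python's built-in "utf-8-sig" codec performs the same
one-time check when decoding.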

    In the old days, it was more common for files to contain end markers and
    the like. But the trend has been away from that, toward letting the file
    itself contain only binary data. Other data can be put in other files, as
    needed. On simpler OSes, one should not have too many files. Under UNIX,
    the situation is the opposite: it is designed to handle a lot of small
    files. And, anyway, it is not up to Unicode to impose conditions on how the
    OS should implement its file handling.

      Hans Aberg

    This archive was generated by hypermail 2.1.5 : Thu Jan 20 2005 - 12:18:34 CST