Re: So how about U+D7FD for a NOP then?

From: Marcin 'Qrczak' Kowalczyk (
Date: Fri Jan 21 2005 - 17:16:16 CST


    "Arcane Jill" <> writes:

    > Now imagine, if you will, that at some time in the future, both uses
    > of U+FEFF are deprecated. U+D7FD could then take over as the new
    > byte order marker - except that /this/ choice will cause no problems
    > for Unix.

    > Because Unix likes streams and filters, and it would be the work of
    > a moment to feed text through a filter that throws away any and all
    > occurrences of NOP.

    It's impractical to insert a filter everywhere a program reads a
    file. While it's easy to add a filter to a command line, it makes
    no sense to force users to do so every time they feed a file to a
    program. And not all programs are given their inputs on the
    command line.
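
For concreteness, the kind of filter being proposed might look like the sketch below. It is only an illustration, not anything posted in the thread: it assumes the input is UTF-8, in which U+D7FD is the three bytes ED 9F BD, and the names strip_nop and main are made up for the example.

```python
import sys

# UTF-8 encoding of U+D7FD, the code point proposed in this thread as a NOP.
NOP_UTF8 = "\ud7fd".encode("utf-8")  # b"\xed\x9f\xbd"


def strip_nop(chunks):
    """Yield the given byte chunks with every UTF-8 encoded U+D7FD removed.

    Assumes well-formed UTF-8 input (hypothetical helper, for illustration).
    """
    carry = b""
    for chunk in chunks:
        data = (carry + chunk).replace(NOP_UTF8, b"")
        # Hold back a trailing partial match, in case the three-byte
        # sequence is split across two chunks.
        keep = 0
        for n in range(len(NOP_UTF8) - 1, 0, -1):
            if data.endswith(NOP_UTF8[:n]):
                keep = n
                break
        carry = data[len(data) - keep:]
        yield data[:len(data) - keep]
    # Whatever is still held back at end of input was not a NOP after all.
    yield carry


def main():
    """Copy stdin to stdout, dropping NOPs; call this to use it as a filter."""
    chunks = iter(lambda: sys.stdin.buffer.read(65536), b"")
    for out in strip_nop(chunks):
        sys.stdout.buffer.write(out)
```

Writing the filter is indeed the work of a moment. The cost lies elsewhere: every program that opens a file by itself would have to run something like this on its input.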

    When files are read directly, a NOP will break all programs which
    process files as sequences of bytes, where only certain ASCII bytes
    are interesting, and all bytes with the 8th bit set are just passed
    around or cause an error, without being interpreted in detail. These
    programs don't care how many characters a byte stream represents,
    whether it is valid, or even whether it's UTF-8 or ISO-8859-2.

    In fact this is the vast majority of programs.

    The same programs are confused by a BOM.
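
A minimal sketch of such a byte-oriented program, to show where the confusion comes from. The colon-separated record format and the name first_fields are assumptions made up for this example; only the ASCII bytes ':' and newline are ever inspected, and everything else is passed through as is.

```python
BOM_UTF8 = b"\xef\xbb\xbf"   # U+FEFF encoded as UTF-8
NOP_UTF8 = b"\xed\x9f\xbd"   # U+D7FD encoded as UTF-8


def first_fields(data):
    """Return the first colon-separated field of each non-empty line.

    Works purely on bytes: only b":" and b"\n" are interpreted, so the
    function neither knows nor cares which encoding the data is in.
    """
    return [line.split(b":", 1)[0] for line in data.split(b"\n") if line]


text = "root:0\nżółw:1000\n"

# The same code handles UTF-8 and ISO-8859-2 without decoding anything.
print(first_fields(text.encode("utf-8")))
print(first_fields(text.encode("iso-8859-2")))

# A leading BOM (or NOP) silently becomes part of the first field,
# so a lookup of b"root" no longer finds it.
print(first_fields(BOM_UTF8 + text.encode("utf-8")))
print(first_fields(NOP_UTF8 + text.encode("utf-8")))
```

In the last two calls the first field is no longer b"root" but b"root" with three extra bytes in front, which the program has no reason to recognize or strip.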

       __("<         Marcin Kowalczyk
