RE: Conformance (was UTF, BOM, etc)

From: Lars Kristan (
Date: Sat Jan 22 2005 - 05:23:27 CST

    Christopher Fynn wrote:

    > MS do still provide which runs in the Windows
    > console - and there
    > are plenty of better third party alternatives.

    Oh, they fixed it; it no longer consumes 100% CPU. Very well done. Can
    it be used to edit UTF-8 text?
    Definitely not. Microsoft already went through one change. It is
    intended to edit DOS files, those in OEM encoding. It doesn't respect
    CP_ACP. Maybe it respects CP_OEMCP, I am not sure.

    I am well aware of the availability of other editors. The only problem is
    that less experienced (or less equipped) users are often tempted to use
    Notepad for the task of editing plain text files.

    What Notepad should do is perhaps make a distinction between its .txt
    documents and plain text files, which would be all others. It should not
    emit a BOM when saving a .bat file, and it should not emit a BOM when
    saving a .htm file.
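
    A minimal sketch of that distinction, assuming a policy where only
    .txt documents get the UTF-8 signature. The function name
    encode_for_save and the BOM_EXTENSIONS set are hypothetical, made up
    for illustration; this is not how Notepad is implemented.

```python
import codecs
import os

# Hypothetical policy: only ".txt" documents get a UTF-8 signature (BOM).
# Everything else (.bat, .htm, ...) is treated as plain text and gets none.
BOM_EXTENSIONS = {".txt"}


def encode_for_save(filename: str, text: str) -> bytes:
    """Return UTF-8 bytes for `text`, prefixed with a BOM only for .txt files."""
    ext = os.path.splitext(filename)[1].lower()
    data = text.encode("utf-8")
    if ext in BOM_EXTENSIONS:
        return codecs.BOM_UTF8 + data
    return data
```

    With this policy, encode_for_save("run.bat", "@echo off") yields the
    bare bytes, while the same call with "notes.txt" yields them with
    EF BB BF in front.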

    Of course they can decide to fix cmd.exe to ignore the BOM, and fix it so
    that it executes batch files when in CP=65001 in the first place. Consuming
    BOMs is optional, while emitting them will prove to be a pain even within
    Windows itself. Which is why I think they haven't fixed it already.
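
    For the consuming side, a tolerant reader only needs to drop a leading
    UTF-8 signature when one is present. A rough sketch follows; the helper
    name strip_utf8_bom is an assumption for illustration and does not
    describe anything cmd.exe actually does.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 BOM (EF BB BF) if present; otherwise return data as-is."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

    Only a signature at the very start is removed, so files saved without
    a BOM pass through untouched.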

    As for the .htm, I have to admit I don't know what the standards say. Frankly, I
    don't care. Whatever they say, they might be wrong. IMO, HTML files are
    plain text. Encoding issues are covered by the directives. Encoding could
    even be switched within the document. It already is: up to the first
    directive, the encoding is ASCII. At least I would define it that way; I don't
    know if it actually is. If the BOM is allowed, it should only be valid (if
    at all) up until the first directive. Opening a .htm file in text mode might
    then be a pain.
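
    To illustrate the model described above (bytes are treated as ASCII
    until the first encoding directive is seen), here is a rough sketch
    that scans the start of a file for a charset declaration. The function
    name declared_charset and the 1024-byte scan limit are assumptions
    chosen for the example; this is not the detection procedure of any
    HTML standard.

```python
import re
from typing import Optional

# Matches e.g. charset=ISO-8859-2 or charset="utf-8" in the raw bytes.
# The pattern is ASCII-only, so it works before the real encoding is known.
_CHARSET_RE = re.compile(rb"""charset\s*=\s*["']?([A-Za-z0-9._:-]+)""", re.IGNORECASE)


def declared_charset(data: bytes, scan_limit: int = 1024) -> Optional[str]:
    """Return the first declared charset (lowercased), or None if none is found."""
    match = _CHARSET_RE.search(data[:scan_limit])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()
```

    A leading BOM does not disturb the scan, since the search simply skips
    over bytes that are not part of the directive.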

