    "Richard T. Gillam" <> writes:

    > For that matter, applications that use the full panoply of
    > signature-byte sequences (0000FEFF for UTF-32BE, FFFE0000 for UTF-32LE,
    > FEFF for UTF-16BE, FFFE for UTF-16LE, EF BB BF for UTF-8, etc.) to
    > determine whether a byte stream is Unicode and what Unicode encoding
    > scheme it is are also implementing a higher-level protocol based on
    > Unicode.

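    Strictly speaking, they can't reliably distinguish UTF-32LE from UTF-16LE:
    the bytes FF FE 00 00 are either the UTF-32LE signature or the UTF-16LE
    signature followed by U+0000.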

    In practice U+0000 as the first character after the marker is rare,
    so perhaps the problem can be ignored...
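
    To make the ambiguity concrete, here is a minimal Python sketch of a
    signature sniffer. The names sniff_signature and SIGNATURES are my own,
    not from any library; only the BOM constants come from the standard
    codecs module. It assumes the common approach of testing the longest
    signatures first, which is exactly what makes it misread the case below.

```python
import codecs

# Longest signatures first, so FF FE 00 00 is tested before FF FE.
# (Hypothetical table for illustration; only the BOM constants are stdlib.)
SIGNATURES = [
    (codecs.BOM_UTF32_BE, "UTF-32BE"),  # 00 00 FE FF
    (codecs.BOM_UTF32_LE, "UTF-32LE"),  # FF FE 00 00
    (codecs.BOM_UTF8, "UTF-8"),         # EF BB BF
    (codecs.BOM_UTF16_BE, "UTF-16BE"),  # FE FF
    (codecs.BOM_UTF16_LE, "UTF-16LE"),  # FF FE
]


def sniff_signature(data: bytes):
    """Return the encoding scheme named by a leading signature, or None."""
    for bom, name in SIGNATURES:
        if data.startswith(bom):
            return name
    return None


# Legitimate UTF-16LE text: signature, then U+0000, then "A".
ambiguous = codecs.BOM_UTF16_LE + "\x00A".encode("utf-16-le")

# The first four bytes are FF FE 00 00, so the sniffer reports UTF-32LE.
print(sniff_signature(ambiguous))
```

    Testing FF FE first would not help: every UTF-32LE stream would then be
    reported as UTF-16LE. The signature alone cannot settle it.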

