On 06/22/2000 02:24:49 AM <Antoine.Leca@renault.fr> wrote:
>It was my understanding that U+FEFF when received as first character
>should be seen as BOM and not as a character, and handled accordingly.
When the encoding scheme is known to be UTF-16BE or UTF-16LE, an initial
U+FEFF *must not* be interpreted as a BOM. When the scheme is known to be UTF-16
(i.e. byte order is unknown), then it *must* be interpreted as a BOM. But
in the case of UTF-8, there is no requirement either way, and so it is
ambiguous: you don't know if it's supposed to be a BOM or ZWNBSP (unlikely
as an initial character, but valid).
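The three cases above can be sketched in Python. This is only an illustration, not part of any standard API: the function name decode_initial_feff and the role labels ("bom", "zwnbsp", "ambiguous", "none") are made up for the example, and the big-endian fallback for unmarked UTF-16 is an assumption that the rules above do not cover.

```python
import codecs

def decode_initial_feff(data: bytes, scheme: str):
    """Decode data and report how an initial U+FEFF was treated.

    Returns (text, role); role is "bom", "zwnbsp", "ambiguous" or "none".
    The function name and role labels are hypothetical, for illustration only.
    """
    scheme = scheme.upper()
    if scheme in ("UTF-16BE", "UTF-16LE"):
        # Byte order is declared: an initial U+FEFF is not a BOM, so it
        # stays in the text as a ZWNBSP character.
        text = data.decode("utf-16-be" if scheme == "UTF-16BE" else "utf-16-le")
        return text, ("zwnbsp" if text.startswith("\ufeff") else "none")
    if scheme == "UTF-16":
        # Byte order is not declared: an initial U+FEFF is a BOM, so it
        # selects the byte order and is removed from the text.
        if data[:2] == codecs.BOM_UTF16_BE:
            return data[2:].decode("utf-16-be"), "bom"
        if data[:2] == codecs.BOM_UTF16_LE:
            return data[2:].decode("utf-16-le"), "bom"
        # Assumption: with no BOM present, fall back to big-endian.
        return data.decode("utf-16-be"), "none"
    if scheme == "UTF-8":
        # No requirement either way: keep the character and flag it so the
        # caller can decide whether to treat it as a BOM or as ZWNBSP.
        text = data.decode("utf-8")
        return text, ("ambiguous" if text.startswith("\ufeff") else "none")
    raise ValueError("unsupported encoding scheme: " + scheme)
```

For UTF-8 the sketch deliberately leaves U+FEFF in place rather than guessing; a caller that wants it stripped could use Python's "utf-8-sig" codec instead.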
Peter Constable