Re: BOM ambiguity?

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Fri, 13 Jul 2012 23:16:44 +0200

Null characters are almost always avoided in interchanged plain text.
This is not a practical problem. The use of nulls as significant
characters is extremely rare, because they almost always require an
envelope format to specify data lengths. Such an envelope format lives
in a file that is not itself plain text.
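
To make the case under discussion concrete, here is a minimal sketch in
Python. The helper name sniff_bom and the sample bytes are made up for
illustration; only the BOM constants come from the standard codecs
module. It shows that a stream starting with FF FE 00 00 matches two
BOMs and decodes cleanly either way.

```python
import codecs

def sniff_bom(data: bytes) -> list[str]:
    """Return every encoding whose BOM is a prefix of data (hypothetical helper)."""
    boms = [
        ("utf-32-le", codecs.BOM_UTF32_LE),  # FF FE 00 00
        ("utf-32-be", codecs.BOM_UTF32_BE),  # 00 00 FE FF
        ("utf-16-le", codecs.BOM_UTF16_LE),  # FF FE
        ("utf-16-be", codecs.BOM_UTF16_BE),  # FE FF
        ("utf-8", codecs.BOM_UTF8),          # EF BB BF
    ]
    return [name for name, bom in boms if data.startswith(bom)]

# Illustrative sample: FF FE followed by six zero bytes.
sample = bytes.fromhex("fffe0000" "00000000")

candidates = sniff_bom(sample)

# Read as UTF-16LE: a 2-byte BOM, then three U+0000 code units.
as_utf16 = sample[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")

# Read as UTF-32LE: a 4-byte BOM, then a single U+0000.
as_utf32 = sample[len(codecs.BOM_UTF32_LE):].decode("utf-32-le")

print(candidates)
print(len(as_utf16), len(as_utf32))
```

A detector that tests the longer BOM first would report UTF-32LE here.
That choice is only wrong for UTF-16 text that begins with U+0000,
which is the case described above as extremely rare in plain text.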

2012/7/13 Stephan Stiller <stephan.stiller_at_gmail.com>:
> As an aside to the BOM discussion - something I've always been meaning to
> ask.
>
> So there is a BOM-ambiguity when a file starts with
> FF FE
> and then a couple of U+0000 characters, yes? Because this could be either
> UTF-16 or UTF-32 under little-endianness. Has this been pointed out and
> discussed beforehand?
>
> Because the set of BOMs in different encodings doesn't constitute a
> prefix-free code.
>
> Stephan
>
>
Received on Fri Jul 13 2012 - 16:19:10 CDT
