From: Doug Ewell (dewell@adelphia.net)
Date: Sun Feb 04 2007 - 14:18:42 CST
Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
> I wanted to signal this because I noted that what was a technical note
> is now displayed as a draft for becoming a UTS (with differences
> emphasized, notably for the conformance requirement). I had read this
> doc a long time ago, and there was no such "draft" status so it was
> not really a problem. Also the licensing issue is still not resolved.
> So the final draft should be listed in the Public Review page to fix
> the exact wording.
>
> The main risk is caused by the ambiguity of the sentence which does
> not indicate that it really encodes the codepoint U+FEFF normally
> (i.e. it changes the current state), and that does not specify if the
> leading BOM is required or optional.
I'm sure it would not be difficult to edit Section 2.5 to explain this,
something like:
"An initial U+FEFF is encoded in BOCU-1 with the three bytes FB EE 28.
Note that adding or stripping an initial U+FEFF generally requires the
next code point above U+0020 to be re-encoded."
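To make the re-encoding effect concrete, here is a minimal Python sketch of a BOCU-1 encoder. The constants are written from the published BOCU-1 description as best I can reconstruct them, and the function names (bocu1_encode, pack_diff, next_prev, trail_byte) are made up for this illustration. Treat it as a sketch under those assumptions, not as a reference implementation.

```python
# Illustrative BOCU-1 encoder sketch. All names are hypothetical; the
# constants are assumed to match the published BOCU-1 description.
MIDDLE = 0x90          # byte value for a difference of zero
ASCII_PREV = 0x40      # initial state, and state after a reset
TRAIL_COUNT = 243      # number of distinct trail byte values
REACH_POS_1, REACH_NEG_1 = 63, -64
REACH_POS_2, REACH_NEG_2 = 10512, -10513
REACH_POS_3, REACH_NEG_3 = 187659, -187660
START_POS_2, START_POS_3, START_POS_4 = 0xD0, 0xFB, 0xFE
START_NEG_2, START_NEG_3, START_NEG_4 = 0x50, 0x25, 0x22

# Trail values 0..19 map to the C0 bytes allowed in trail position.
LOW_TRAILS = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06,
                    0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
                    0x18, 0x19, 0x1C, 0x1D, 0x1E, 0x1F])


def trail_byte(t):
    """Map a trail value 0..242 to its byte."""
    return LOW_TRAILS[t] if t < 20 else t + 13


def next_prev(c):
    """State ("previous" value) to use after encoding code point c."""
    if 0x3040 <= c <= 0x309F:          # Hiragana
        return 0x3070
    if 0x4E00 <= c <= 0x9FA5:          # CJK Unihan
        return 0x4E00 - REACH_NEG_2
    if 0xAC00 <= c <= 0xD7A3:          # Hangul syllables
        return (0xD7A3 + 0xAC00) // 2
    return (c & ~0x7F) + ASCII_PREV    # middle of the 128-block


def pack_diff(diff):
    """Encode one signed difference as 1 to 4 bytes."""
    if REACH_NEG_1 <= diff <= REACH_POS_1:
        return bytes([MIDDLE + diff])
    if diff > 0:
        if diff <= REACH_POS_2:
            diff, lead, count = diff - (REACH_POS_1 + 1), START_POS_2, 1
        elif diff <= REACH_POS_3:
            diff, lead, count = diff - (REACH_POS_2 + 1), START_POS_3, 2
        else:
            diff, lead, count = diff - (REACH_POS_3 + 1), START_POS_4, 3
    else:
        if diff >= REACH_NEG_2:
            diff, lead, count = diff - REACH_NEG_1, START_NEG_2, 1
        elif diff >= REACH_NEG_3:
            diff, lead, count = diff - REACH_NEG_2, START_NEG_3, 2
        else:
            diff, lead, count = diff - REACH_NEG_3, START_NEG_4, 3
    trails = []
    for _ in range(count):
        # divmod floors, which is what negative differences need
        diff, m = divmod(diff, TRAIL_COUNT)
        trails.append(trail_byte(m))
    return bytes([lead + diff] + trails[::-1])


def bocu1_encode(text):
    """Encode a string without emitting any FF reset bytes."""
    out = bytearray()
    prev = ASCII_PREV
    for ch in text:
        c = ord(ch)
        if c <= 0x20:
            if c != 0x20:              # C0 controls reset the state; space does not
                prev = ASCII_PREV
            out.append(c)
        else:
            out += pack_diff(c - prev)
            prev = next_prev(c)
    return bytes(out)


print(bocu1_encode("\uFEFF").hex(" "))   # the signature discussed above
print(bocu1_encode("A").hex(" "))        # "A" on its own
print(bocu1_encode("\uFEFFA").hex(" "))  # "A" is encoded differently here
```

In this sketch an initial U+FEFF leaves the state in the U+FE80 block, so a following "A" is encoded as a three-byte difference instead of the single byte it takes at the start of a text. That is why adding or stripping the signature changes the bytes of the next code point above U+0020.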
> If encoding the reset byte FF is not recommended, then the leading BOM
> should not be recommended either, because this is a concatenation of
> an unrelated substring to the text. That is why I think that, in that
> case, the BOM, if used, should be followed by a RESET byte,
> even if the rest of the document does not use any RESET byte.
FF resets can also improve compression, particularly when a character
beyond U+2980 is followed by a Basic Latin character. If I were a legal
stakeholder in the BOCU project, I would have taken the italicized
passage in Section 2.4:
"Using FF to reset the state breaks the ordering and the deterministic
encoding! The use of FF resets is discouraged."
and added:
"... in applications where these features are more important than
optimum compression."
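The compression point can be checked with a little arithmetic. The following Python sketch only counts bytes; it assumes the reach limits and state rules from the published BOCU-1 description, and the names diff_len, state_after and latin_cost are invented for this illustration.

```python
# Byte-count sketch: cost of a Basic Latin character after some other
# character, with and without an FF reset in between. Names are hypothetical.
REACHES = ((-64, 63), (-10513, 10512), (-187660, 187659))
ASCII_PREV = 0x40  # state at start of text and after an FF reset


def diff_len(diff):
    """Number of bytes BOCU-1 needs for a signed difference."""
    for n, (lo, hi) in enumerate(REACHES, start=1):
        if lo <= diff <= hi:
            return n
    return 4


def state_after(c):
    """State left behind by code point c (assumed BOCU-1 rules)."""
    if 0x3040 <= c <= 0x309F:
        return 0x3070
    if 0x4E00 <= c <= 0x9FA5:
        return 0x4E00 + 10513
    if 0xAC00 <= c <= 0xD7A3:
        return (0xD7A3 + 0xAC00) // 2
    return (c & ~0x7F) + ASCII_PREV


def latin_cost(prev_char, latin_char):
    """Return (bytes without reset, bytes with FF reset) for latin_char."""
    plain = diff_len(ord(latin_char) - state_after(ord(prev_char)))
    with_reset = 1 + diff_len(ord(latin_char) - ASCII_PREV)
    return plain, with_reset


for prev_char in ("\u0416", "\u2980", "\u4E2D"):
    print(f"U+{ord(prev_char):04X} then 'a':", latin_cost(prev_char, "a"))
```

Under these assumptions a Latin letter right after a Cyrillic one costs two bytes either way, while after U+2980 or a CJK ideograph it costs three bytes without the reset and two with it (FF plus a single byte). The state after the Latin letter is the same in both cases, so the reset costs nothing later in the text.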
To me these are all implementation details and can be easily worked out,
whereas the patent encumbrance is a showstopper.
--
Doug Ewell * Fullerton, California, USA * RFC 4645 * UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages