Re: UTF-8 ill-formed question

From: Philippe Verdy <>
Date: Sun, 16 Dec 2012 16:19:39 +0100

2012/12/16 Otto Stolz <>

> The reason I excluded the surrogates from my UTF-8 MPE
> was really that I needed additional space for the user's
> guide on the reverse side.

Why would adding a row on the front side not have preserved the space for
the reverse side?
If this is regarded as a didactic tool, adding this row would have focused
more on the validity constraint of UTF-8, enforced in TUS and now also in
the IETF RFC and in the ISO standard, both made fully compatible with TUS.
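
To illustrate the validity constraint in question: in the well-formed byte
sequence table of TUS, a lead byte ED may only be followed by a second byte
in 80..9F, so that U+D800..U+DFFF cannot be encoded. A minimal sketch in
Python, relying on the strict built-in "utf-8" codec; the helper name
is_well_formed_utf8 is only an illustration, not something from the MPE:

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if data decodes under Python's strict UTF-8 codec."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# ED 9F BF encodes U+D7FF, the last code point before the surrogate range.
print(is_well_formed_utf8(b"\xed\x9f\xbf"))
# ED A0 80 would be U+D800 under the old bit pattern; the strict codec rejects it.
print(is_well_formed_utf8(b"\xed\xa0\x80"))
# EE 80 80 encodes U+E000, the first code point after the surrogate range.
print(is_well_formed_utf8(b"\xee\x80\x80"))
```

Python also offers a "surrogatepass" error handler that accepts such
sequences on request, which shows that the constraint is a deliberate
rule of the encoding form rather than a limit of the bit layout.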

I think that the row was missing only because your MPE was initially
designed for the old UTF-8 definition in the now obsolete ISO definition,
where the validity constraint was not clear. It was not clear either in
past variations of UTF-8 that still exist in Java (not really for
plain-text interchange, but for the native JNI API compatible with 8-bit C
strings, and as part of the serialization format of compiled Java classes).
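
For comparison, the Java variant (documented by Java as "modified UTF-8")
differs from standard UTF-8 in two ways: U+0000 is written as C0 80, so the
result never contains a zero byte, and supplementary characters are written
as two 3-byte encoded surrogates. A rough sketch in Python, assuming only
those two documented rules; the function name modified_utf8 is hypothetical
and this is not the JDK's own code:

```python
def modified_utf8(text: str) -> bytes:
    """Encode text following the two rules of Java's modified UTF-8 (sketch)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp == 0:
            # U+0000 becomes the overlong pair C0 80 (safe in 8-bit C strings).
            out += b"\xc0\x80"
        elif cp >= 0x10000:
            # Split into a UTF-16 surrogate pair, then encode each half
            # separately as a 3-byte sequence.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            out += chr(high).encode("utf-8", "surrogatepass")
            out += chr(low).encode("utf-8", "surrogatepass")
        else:
            # Everything else matches standard UTF-8 (lone surrogates pass through).
            out += ch.encode("utf-8", "surrogatepass")
    return bytes(out)


# U+1F600 is F0 9F 98 80 in standard UTF-8, but six bytes in this variant.
print(modified_utf8("\U0001F600").hex(" "))
```

Both special cases are ill-formed under the current UTF-8 definition, which
is why this format is unsuitable for plain-text interchange.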

Add this missing row; everything on the reverse side can remain the same
(or can use a less "cryptic" compact description of how it works).
Received on Sun Dec 16 2012 - 09:24:13 CST
