Re: UTF-8 Corrigendum, new Glossary

From: G. Adam Stanislav
Date: Thu Nov 30 2000 - 20:05:30 EST

On Thu, Nov 30, 2000 at 10:18:07AM -0800, Markus Scherer wrote:
>you are free to write and use a non-conformant implementation. just be aware of what that means... :-)

I guess it means I'm a non-conformist. :)

I am currently working on software that translates mark-up written in
one mark-up language (Ister) into another (HTML). It uses UTF-8 and
works as CGI, i.e., it generates HTML dynamically on a web
server (see for unfinished docs).

If the source (in Ister) uses illegal but decipherable UTF-8, my
software accepts it. Naturally, before sending it out, it transforms
it into perfectly legal UTF-8. The idea that I should reject it is
silly (and, no, the "internal data" clause does not apply here: my
software accepts data from an external source). Rejecting it would
mean that if the web page designer used some design software that
messed up the UTF-8 encoding, the web page would suddenly be missing
a letter here, a letter there. Not rejecting it poses no security
risk, so for this specific application it is better to accept it
(and correct it) than to reject it.
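
The accept-and-correct approach described above can be sketched
roughly as follows. This is only an illustration in Python, not the
actual Ister code: the function names decode_lenient and
normalize_utf8 are made up for the example, and it assumes that
"illegal but decipherable" means non-shortest-form (overlong)
sequences whose bit pattern still yields a valid code point.

```python
def decode_lenient(data: bytes) -> str:
    """Decode UTF-8, tolerating non-shortest-form (overlong) sequences.

    Hypothetical helper for illustration; anything that cannot be
    deciphered at all (stray bytes, truncation, surrogates) is rejected.
    """
    out = []
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        # Work out the payload bits and the number of continuation bytes.
        if lead < 0x80:
            cp, extra = lead, 0
        elif 0xC0 <= lead <= 0xDF:
            cp, extra = lead & 0x1F, 1
        elif 0xE0 <= lead <= 0xEF:
            cp, extra = lead & 0x0F, 2
        elif 0xF0 <= lead <= 0xF7:
            cp, extra = lead & 0x07, 3
        else:
            raise ValueError(f"undecipherable lead byte 0x{lead:02X} at offset {i}")
        if i + extra >= n:
            raise ValueError(f"truncated sequence at offset {i}")
        for k in range(1, extra + 1):
            cont = data[i + k]
            if (cont & 0xC0) != 0x80:
                raise ValueError(f"bad continuation byte at offset {i + k}")
            cp = (cp << 6) | (cont & 0x3F)
        # Note: no shortest-form check here, which is the lenient part.
        if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"not a valid code point: U+{cp:04X}")
        out.append(chr(cp))
        i += extra + 1
    return "".join(out)


def normalize_utf8(data: bytes) -> bytes:
    """Accept lenient input, emit strictly legal (shortest-form) UTF-8."""
    return decode_lenient(data).encode("utf-8")
```

For example, normalize_utf8(b"\xC1\x81") returns b"A": the two-byte
overlong form of "A" is accepted on input and written back out as the
single legal byte. Input that is already legal passes through
unchanged.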


Don't send me spam, I'm a vegetarian
