Re: Filtering and displaying untrusted UTF-8

From: Doug Ewell
Date: Thu Dec 31 2009 - 19:25:06 CST

Petr Tomasek <tomasek at etf dot cuni dot cz> wrote:

>> * 0xE000 - 0xF900 (private use; since everyone can make up a
>> different character for a code point in private use, filter them all)
> This is a very bad idea, since it effectively blocks people from using
> characters other than those defined in the Unicode Standard. (BTW,
> Microsoft and others have their own PUA assignments...)

Perhaps it would help to know exactly what security hole this is
supposed to close. I don't get it either; if I want to define U+E000 as
a Tengwar letter, someone else wants to define it as a Latin ligature,
and someone else wants it to be a Masonic cipher symbol, how does
receiving U+E000 across this connection affect security?
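
For anyone who wants to see what is actually arriving before deciding on a policy, here is a minimal Python sketch that flags private-use code points instead of discarding them. It assumes the standard library's unicodedata module; the function names are hypothetical and are not part of any proposal in this thread. Note that the BMP private-use range is U+E000..U+F8FF (U+F900 is an assigned CJK compatibility ideograph), and that planes 15 and 16 contain private-use code points as well.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """Return True if ch has General_Category Co (Private Use)."""
    return unicodedata.category(ch) == "Co"


def private_use_positions(text: str) -> list[int]:
    """Return the indexes of all private-use code points in text."""
    return [i for i, ch in enumerate(text) if is_private_use(ch)]


if __name__ == "__main__":
    # U+E000 is private use; the surrounding letters are not.
    sample = "a\ue000b"
    print(private_use_positions(sample))
```

Testing the General_Category rather than a hard-coded range covers all three private-use blocks and avoids off-by-one errors at the range boundaries.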

Doug Ewell  |  Thornton, Colorado, USA
RFC 5645, 4645, UTN #14  |  ietf-languages @
