From: Peter Kirk (peterkirk@qaya.org)
Date: Fri Jan 21 2005 - 09:52:11 CST
On 21/01/2005 14:56, Arcane Jill wrote:
> What with all the BOM difficulties, and the fact that U+FEFF doubles 
> up as ZERO WIDTH NO-BREAK SPACE, a new possibility occurred to me.
>
> Imagine if the codepoint U+D7FD were reserved as NOP, having 
> properties which essentially made it completely ignorable and 
> invisible. It could simply be thrown away, wherever it was encountered.
>
Interesting idea, Jill. But would it not be easier simply to redefine 
the properties of U+FEFF so that it is effectively a NOP? I know the 
name cannot be changed, but I think the relevant properties can be. This 
would of course affect a small number of existing texts which make use 
of the non-breaking properties of U+FEFF and have not switched to the 
preferred WORD JOINER. But there are precedents for such changes in 
properties which break deprecated uses of characters.

The great advantage of this is that it requires no changes to current 
software for recognising and converting between encoding schemes.

This does not remove the distinction between the BOM signature and the 
encoded representation of the character U+FEFF; it just means that a 
failure to make the distinction has no practical effect. Note also that 
processes would not be allowed to delete or insert U+FEFF or any other 
NOP if it is actually a character. Or perhaps they could, if a new NOP 
character could be made canonically equivalent to the null string. Would 
that be possible?
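
To make the "no practical effect" point concrete, here is a minimal Python
sketch. It assumes a consumer that ignores U+FEFF wherever it appears; the
helper drop_nop is hypothetical and only models that proposed behaviour,
which nothing in Unicode currently requires. The two codecs used are
Python's standard "utf-8" (keeps a leading U+FEFF as a character) and
"utf-8-sig" (treats it as a signature and strips it).

```python
BOM = "\ufeff"  # U+FEFF, ZERO WIDTH NO-BREAK SPACE / byte order mark


def drop_nop(text: str, nop: str = BOM) -> str:
    """Hypothetical consumer: discard every occurrence of the NOP character."""
    return text.replace(nop, "")


# UTF-8 signature (EF BB BF) followed by "abc".
data = b"\xef\xbb\xbfabc"

# Decoder that does NOT recognise the signature: U+FEFF survives as a character.
as_character = data.decode("utf-8")

# Decoder that DOES recognise the signature: the leading U+FEFF is stripped.
as_signature = data.decode("utf-8-sig")

# The two decoders disagree about the text...
decoders_agree = as_character == as_signature

# ...but a consumer that treats U+FEFF as ignorable sees the same result.
consumers_agree = drop_nop(as_character) == drop_nop(as_signature)
```

With U+FEFF treated this way, it stops mattering which of the two decoders
was used, even though the decoded strings themselves still differ.
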

Of course it would be possible to redefine the encoding schemes such 
that the encoded representation of U+D7FD is not interpreted as a 
character at all but is discarded on decoding. But this is something 
different, and more disruptive.
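
As a rough sketch of that second alternative, the following Python models a
decoder that throws the code point away before the application ever sees it.
The function name decode_discarding_nop is hypothetical, and the NOP role of
U+D7FD is only the proposal under discussion, not an existing property.

```python
NOP = "\ud7fd"  # U+D7FD, the code point proposed in this thread (assumption)


def decode_discarding_nop(data: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes, then drop every NOP so it never reaches the caller.

    This models an encoding scheme in which the encoded form of U+D7FD
    is not interpreted as a character at all.
    """
    return data.decode(encoding).replace(NOP, "")


# The NOP is present in the byte stream in either encoding scheme...
utf8_bytes = "a\ud7fdb".encode("utf-8")
utf16_bytes = "a\ud7fdb".encode("utf-16-le")

# ...but is absent from the decoded text.
from_utf8 = decode_discarding_nop(utf8_bytes)
from_utf16 = decode_discarding_nop(utf16_bytes, "utf-16-le")
```

Unlike the property change, this alters what every decoder must do, and the
code point cannot survive a decode and re-encode round trip.
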

-- 
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/