Re: UTF-8 signature in web and email

From: John Cowan (cowan@mercury.ccil.org)
Date: Tue May 22 2001 - 07:20:31 EDT


Marco Cimarosti scripsit:

> You forget one fundamental thing about U+FEFF: it is not (only) a "byte
> order mark" or an "encoding signature": it (also) is a "ZERO WIDTH NO-BREAK
> SPACE".

Actually, this semantic seems to be going away soon, but until it does...

> I.e., it has been designed to be a white space, to not separate words, to
> not constitute a line-break opportunity and, last but not least, to be
> invisible.
>
> In other words, if correctly implemented, it is a totally non-invasive
> character: a very gentle little animal that should cause no harm to nobody.

...it is not quite true that ZWNBSP has no semantics. There is a fundamental
difference between "inactive" and "in!active" (where "!" represents ZWNBSP),
namely that at the end of a line such as this one, it is correct to show "in-
active" with hyphenation, whereas "in!active" at the end of a line must be
"inactive", with wordwrap.
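
To make the distinction concrete, here is a minimal Python sketch, not a real line-breaking implementation. It assumes that only a leading U+FEFF counts as a signature, and it models hyphenation as a list of candidate break indexes supplied by the caller. The helper names (strip_signature, allowed_breaks) are hypothetical and exist only for illustration.

```python
ZWNBSP = "\ufeff"  # U+FEFF: byte order mark / ZERO WIDTH NO-BREAK SPACE


def strip_signature(raw: bytes) -> str:
    """Decode UTF-8, dropping U+FEFF only when it is the very first character.

    Python's "utf-8-sig" codec removes a leading signature and leaves any
    later U+FEFF in place, where it still acts as ZWNBSP.
    """
    return raw.decode("utf-8-sig")


def allowed_breaks(text: str, candidates: list[int]) -> list[int]:
    """Toy model (an assumption, not a full line breaker): keep only the
    candidate break indexes that are not adjacent to a ZWNBSP.

    A candidate index i means "break before text[i]".
    """
    allowed = []
    for i in candidates:
        before = text[i - 1] if i > 0 else ""
        after = text[i] if i < len(text) else ""
        if ZWNBSP in (before, after):
            continue  # ZWNBSP forbids a break on either side of it
        allowed.append(i)
    return allowed


plain = strip_signature(b"\xef\xbb\xbfinactive")
joined = strip_signature(b"\xef\xbb\xbfin\xef\xbb\xbfactive")

print(allowed_breaks(plain, [2]))      # "in-" / "active" is permitted
print(allowed_breaks(joined, [2, 3]))  # no break permitted inside the word
```

With no permitted break inside "in!active", a renderer in this model has no choice but to move the whole word to the next line.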

-- 
John Cowan                                   cowan@ccil.org
One art/there is/no less/no more/All things/to do/with sparks/galore
	--Douglas Hofstadter


