Re: Unicode and Kermit

From: John Cowan (cowan@locke.ccil.org)
Date: Wed Aug 11 1999 - 10:37:56 EDT


Mark Davis wrote:
>
> > Always writing a BOM is a safe choice, because a BOM is semantically
> > zero-width no-break space, which is essentially a no-op.
> >
>
> This is not quite true: BOM is not quite a NO-OP; it does need to be removed
> from a file. For example, if I split a file into two, then concatenate, the
> result should be identical to the original--it isn't unless I remove the BOM.

True. But what effect does the extra ZWNBSP have in such a case?
Nearly none: the character is zero-width, does not affect breaking,
etc. (If the file was broken between a base character and its
combining character(s), then you may have a problem.)
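
To make both cases concrete, here is a minimal Python sketch. The helper
names (write_piece, naive_concat, concat_stripping_bom) are made up for
illustration and work on in-memory strings rather than real files; it
assumes every piece is written with a leading U+FEFF.

```python
import unicodedata

BOM = "\ufeff"  # U+FEFF: byte order mark, otherwise zero-width no-break space


def write_piece(text):
    """Hypothetical writer that always emits a BOM first."""
    return BOM + text


def naive_concat(pieces):
    """Join pieces as-is, leaving any interior BOMs in place."""
    return "".join(pieces)


def concat_stripping_bom(pieces):
    """Join pieces, keeping only the first piece's leading BOM."""
    head, *rest = pieces
    return head + "".join(p[1:] if p.startswith(BOM) else p for p in rest)


content = "Unicode and Kermit"
original = write_piece(content)
pieces = [write_piece(content[:7]), write_piece(content[7:])]

# Naive concatenation leaves a stray ZWNBSP in the middle of the text.
naive = naive_concat(pieces)
round_trips_naively = naive == original

# Removing the interior BOM restores the original exactly.
restored = concat_stripping_bom(pieces)
round_trips_when_stripped = restored == original

# The problem case: a split between a base character and its combining mark.
accented = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
broken = naive_concat([write_piece(accented[:1]), write_piece(accented[1:])])
composed_intact = unicodedata.normalize("NFC", accented)
composed_broken = unicodedata.normalize("NFC", broken[1:])  # drop leading BOM
```

In the ordinary case the only difference is one invisible character in
the middle of the text. In the combining case the accent no longer
follows its base character, so NFC normalization cannot compose the
pair; how it renders depends on the display software.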

-- 
	John Cowan	http://www.ccil.org/~cowan	cowan@ccil.org
Schlingt dreifach einen Kreis um dies! / Schliesst euer Aug vor heiliger Schau,
Denn er genoss vom Honig-Tau / Und trank die Milch vom Paradies.
			-- Coleridge / Politzer


