Re: Unused code positions and mapping to Unicode

From: John Cowan (cowan@locke.ccil.org)
Date: Mon Aug 16 1999 - 11:40:09 EDT


Edward Cherlin wrote:

> [sigh] Wouldn't it be nice if
>
> [1] owners always put new version numbers on code sets when changing the
> definitions,
>
> [2] software used any available version data as part of file format
> definitions

In the case of charsets that grow only by changing undefined codepoints
to defined ones, like Unicode after 2.0 and CP1252, there is no point in
versioning information. Old software with new data won't know how
to convert the newly defined codepoints, whatever the label says; new
software with old data will work fine. Indeed,
version numbers are considered harmful in that case. Software
that believes it only understands Version 1 might choke and die
on data labelled "Version 2" even though it is, in fact, identical
to the Version 1 equivalent.
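
To make the argument above concrete, here is a minimal sketch in Python. The two code tables, the `decode` helper and the `strict_reader` function are all hypothetical names invented for illustration; they do not correspond to any real charset or library. The only assumption that matters is that the second table adds a mapping and never changes an existing one.

```python
# Hypothetical code tables: V2 only defines a previously undefined
# codepoint (0x80); every mapping present in V1 is unchanged.
TABLE_V1 = {0x41: "A", 0x42: "B"}
TABLE_V2 = dict(TABLE_V1)
TABLE_V2[0x80] = "\u20ac"  # newly defined codepoint (EURO SIGN)


def decode(data, table):
    """Map each byte through the table; undefined bytes become U+FFFD."""
    return "".join(table.get(byte, "\ufffd") for byte in data)


def strict_reader(data, label, table=TABLE_V1, understood=("1",)):
    """Old software that refuses any version label it does not recognise."""
    if label not in understood:
        raise ValueError("unsupported version: " + label)
    return decode(data, table)


old_data = bytes([0x41, 0x42])  # uses only codepoints defined in V1
new_data = bytes([0x41, 0x80])  # uses the codepoint added in V2

# New software, old data: identical result to the old table.
same_result = decode(old_data, TABLE_V2) == decode(old_data, TABLE_V1)

# Old software, new data: the new codepoint cannot be converted,
# and no version label would have changed that.
lossy_result = decode(new_data, TABLE_V1)
```

In this sketch, `strict_reader(old_data, "2")` raises `ValueError` even though the bytes are exactly what the Version 1 reader already handles, which is the failure mode described above.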

As a particular case, ISO-2022-JP uses Version 2 of one of its
constituent coded character sets, but labels it Version 1 (default).
The only difference is two extra ideographs. Using the proper ISO 2022
version-number flag would just cause old software to cough, so it is
not included.
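
As a rough illustration, the following Python sketch inspects the escape sequences in a short ISO-2022-JP byte stream. It relies on Python's built-in `iso2022_jp` codec; the constant names are made up for this example, and the byte form given for the revision flag (`ESC & @`) is stated here as an assumption about how ISO 2022 announces a revised set, not something taken from the message above.

```python
ESC = b"\x1b"
DESIGNATE_JISX0208 = ESC + b"$B"  # designation ISO-2022-JP uses for JIS X 0208
REVISION_FLAG = ESC + b"&@"       # assumed form of the ISO 2022 revision flag

# Encode two kanji (U+65E5, U+672C) with Python's built-in codec.
encoded = "\u65e5\u672c".encode("iso2022_jp")

# The stream switches to the two-byte set with the plain designation...
has_designation = DESIGNATE_JISX0208 in encoded

# ...and carries no revision flag in front of it.
has_revision_flag = REVISION_FLAG in encoded
```

A decoder written before the revision sees nothing unfamiliar in the escape sequences; only the added ideographs themselves would be unknown to it.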

-- 
	John Cowan	http://www.ccil.org/~cowan	cowan@ccil.org
Schlingt dreifach einen Kreis um dies! / Schliesst euer Aug vor heiliger Schau,
Denn er genoss vom Honig-Tau / Und trank die Milch vom Paradies.
			-- Coleridge / Politzer


