From: Philippe Verdy (email@example.com)
Date: Sat Jul 19 2003 - 09:23:34 EDT
On Friday, July 18, 2003 10:18 PM, Michael Everson <firstname.lastname@example.org> wrote:
> I *prefer* Unicode to any subset thereof.
Why such a preference? Unicode does not define the character set itself (that is defined by ISO 10646), but character properties and related algorithms, and (in cooperation with ISO 10646) the code point assignments.
For me, Unicode is NOT a character set, but an encoded character set, with a small but important nuance: you need to specify a version after "Unicode" to designate a character set. So Unicode 4.0 is a character set, and a superset of Unicode 3.2, but "Unicode" alone is not.
If you just look at this definition, you cannot "prefer Unicode to any subset", because Unicode is just the name of a collection of standards, character sets and algorithms, and each version is already a subset of the next one... If you cannot accept the idea of subsets, then don't use Unicode, or wait until the Unicode standard is definitively closed, or permanently assume that its repertoire is now closed and that no more characters will be added... Of course you would be wrong.
MES-2 or its MES extension is a character set (like most legacy encodings registered with IANA, which are also encoded character sets). In practice, nobody can implement software without clearly bounded sets of characters. So versioning is absolutely necessary to fix these bounds in terms of implementation levels.
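To illustrate what a version-bounded repertoire means in practice, here is a minimal sketch in Python. It relies on the standard unicodedata module, which exposes both the database the runtime ships with and a frozen Unicode 3.2.0 database (unicodedata.ucd_3_2_0). The helper names is_assigned and newer_only are hypothetical, and treating "general category other than Cn" as "belongs to the repertoire" is a simplifying assumption, not a definition taken from the standard.

```python
import unicodedata


def is_assigned(cp, db=unicodedata):
    """Return True if code point `cp` is assigned in database `db`.

    Assumption: a code point counts as assigned when its general
    category is anything other than Cn (unassigned).
    """
    return db.category(chr(cp)) != "Cn"


def newer_only(start, stop, old=unicodedata.ucd_3_2_0, new=unicodedata):
    """List code points in [start, stop) assigned in `new` but not in `old`."""
    return [
        cp
        for cp in range(start, stop)
        if is_assigned(cp, new) and not is_assigned(cp, old)
    ]


if __name__ == "__main__":
    print("runtime database:", unicodedata.unidata_version)
    print("frozen database: ", unicodedata.ucd_3_2_0.unidata_version)
    # Code points in the Emoticons block that postdate Unicode 3.2.
    added = newer_only(0x1F600, 0x1F650)
    print(len(added), "code points in this range were added after 3.2")
```

The point of the sketch is that the question "is this character in Unicode?" only gets a definite answer once a version is fixed: the same code point can be outside the 3.2 repertoire and inside a later one.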
-- Philippe. Spam not tolerated: any unsolicited message will be reported to your Internet service providers.