From: Doug Ewell (dewell@adelphia.net)
Date: Wed Mar 02 2005 - 10:42:13 CST
Elliotte Harold <elharo at metalab dot unc dot edu> wrote:
>> ... Really, no opinions have ever changed that much regarding the
>> Plane 14 tags. They were born as the red-headed stepchildren of
>> Unicode; they were created only to prevent a particular protocol from
>> using a mutant form of UTF-8 for language tagging.
>
> Which protocol was that?
That would be ACAP. They needed a technique for plain-text language
tagging, which ruled out a separate markup layer of the form
<span lang="xx">...</span>.
There was an Internet-Draft for something called "Multi-Lingual String
Format" (MLSF), written by Chris Newman, that described the proposed
mechanism in exemplary detail. Basically, it used illegal UTF-8
sequences to represent language tags. It was easy to encode and decode,
as designed. It was also incompatible with any form of Unicode other
than UTF-8, and would have been rejected by existing UTF-8 processors
that did not expect this "higher layer." It very possibly would have
destroyed the stability, and consequently the widespread acceptance, of
UTF-8.
You can read the draft here:
http://xml.coverpages.org/draft-ietf-acap-mlsf-01.txt
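To make the compatibility point concrete, here is a minimal sketch in Python. It does not reproduce MLSF's actual byte layout. The 0xFF byte below is only a hypothetical stand-in for "an octet that can never occur in well-formed UTF-8," and the helper names (strict_utf8_accepts, plane14_tag, round_trips) are invented for illustration. The Plane 14 side uses the real code points: U+E0001 LANGUAGE TAG followed by tag characters U+E0020..U+E007E, which mirror ASCII 0x20..0x7E.

```python
# Sketch under stated assumptions: the "in-band" marker is hypothetical and
# is NOT the encoding defined in the MLSF draft.

LANGUAGE_TAG = 0xE0001   # U+E0001 LANGUAGE TAG
TAG_OFFSET = 0xE0000     # tag characters U+E0020..U+E007E mirror ASCII


def strict_utf8_accepts(data: bytes) -> bool:
    """Return True if a strict UTF-8 decoder accepts the byte string."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def plane14_tag(lang: str) -> str:
    """Spell a language tag such as 'fr' using Plane 14 tag characters."""
    if not all(0x20 <= ord(c) <= 0x7E for c in lang):
        raise ValueError("language tag must be printable ASCII")
    return chr(LANGUAGE_TAG) + "".join(chr(TAG_OFFSET + ord(c)) for c in lang)


def round_trips(text: str, encoding: str) -> bool:
    """Return True if text survives an encode/decode cycle unchanged."""
    return text.encode(encoding).decode(encoding) == text


# Hypothetical in-band marker: one octet that is never valid in UTF-8.
in_band = b"\xff" + "bonjour".encode("utf-8")

# Plane 14 tagging: ordinary (if unusual) Unicode characters.
tagged = plane14_tag("fr") + "bonjour"

if __name__ == "__main__":
    print("strict decoder accepts in-band bytes:", strict_utf8_accepts(in_band))
    print("strict decoder accepts Plane 14 text:",
          strict_utf8_accepts(tagged.encode("utf-8")))
    for enc in ("utf-8", "utf-16", "utf-32"):
        print(enc, "round trip:", round_trips(tagged, enc))
```

A strict decoder refuses the in-band bytes outright, and there is no way to carry such a marker into UTF-16 or UTF-32 at all. The Plane 14 version is just a string of code points, so it passes through every encoding form unchanged.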
In my 2002 paper arguing for the non-deprecation and greater acceptance
of Plane 14 tags, I attributed MLSF to "a group of CJK users" who wanted
language tags to perform Han glyph selection. That assumption was
clearly bogus.
-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/