Re: Generic Tagging: A Bold Proposal

From: Pete Resnick
Date: Tue Jul 15 1997 - 18:29:13 EDT

On 7/15/97 at 4:21 PM -0500, Markus G. Kuhn wrote:

>Murray Sargent wrote on 1997-07-15 20:27 UTC:
>> I dare say that most people on the UTC agree with you and don't want to
>> put any tagging scheme into Unicode or 10646. But there are lots of
>> people out there who don't want to depend on a higher-level protocol
>> such as HTML to specify language.
>The real problem that the people who want language tagging so urgently fail
>to understand is that they actually want a vendor independent standard
>WYSIWYG file format for word processing. Some people seem to believe
>that Unicode is a word processing file format, which it is clearly not,
>just as ASCII is not.

Not all of us. Please go back and read the thread I posted in June with the
subject "CJK Tags - Fish or cut bait". I agree with you that full language
tagging does not really need to go inside the text itself. The problem is
that with any other plain text character set (ASCII, 8859-x, 2022), you can
always figure out at least some reasonable input method to use from the
characters alone; Unicode does not allow you to do this for CJK. On the
Macintosh, and I believe on other systems that handle international text,
the user has the option to have the system choose a default input method
based on the character set being used, with no other external markup
information. Likewise, I can easily tell from the Unicode script ranges
which input method to use *except* for CJK.
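
A minimal sketch of the point above, in Python. The function names, the
range table, and the input method labels are all hypothetical and are not
taken from any real system; the table lists only a handful of Unicode
blocks, and the unified ideograph block U+4E00..U+9FFF stands in for the
CJK case, which is an assumption about where the ambiguity lies.

```python
# Illustrative (first, last, script) code point ranges; far from complete.
SCRIPT_RANGES = [
    (0x0041, 0x005A, "Latin"),     # A-Z
    (0x0061, 0x007A, "Latin"),     # a-z
    (0x0370, 0x03FF, "Greek"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x0590, 0x05FF, "Hebrew"),
    (0x0600, 0x06FF, "Arabic"),
    (0x3040, 0x30FF, "Kana"),      # Hiragana and Katakana
    (0xAC00, 0xD7A3, "Hangul"),    # Hangul syllables
    (0x4E00, 0x9FFF, "Han"),       # unified ideographs
]

# Hypothetical defaults. "Han" is deliberately absent: the same code
# points serve Chinese, Japanese, and Korean text, so the characters
# alone do not select an input method.
DEFAULT_INPUT_METHOD = {
    "Latin": "Roman keyboard",
    "Greek": "Greek keyboard",
    "Cyrillic": "Cyrillic keyboard",
    "Hebrew": "Hebrew keyboard",
    "Arabic": "Arabic keyboard",
    "Kana": "Japanese input method",
    "Hangul": "Korean input method",
}


def script_of(ch):
    """Return the script name for one character, or None if unlisted."""
    cp = ord(ch)
    for first, last, name in SCRIPT_RANGES:
        if first <= cp <= last:
            return name
    return None


def guess_input_method(text):
    """Pick a default input method from the first character with a listed
    script. Returns None when no script is recognized or when the script
    (Han) does not determine an input method."""
    for ch in text:
        script = script_of(ch)
        if script is not None:
            return DEFAULT_INPUT_METHOD.get(script)
    return None
```

With these tables, guess_input_method("Привет") gives "Cyrillic keyboard",
while guess_input_method("漢字") gives None: the range lookup succeeds, but
nothing in the text says which of the CJK input methods to offer.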


Pete Resnick
QUALCOMM Incorporated
Work: (217)337-6377 / Fax: (217)337-1980
