Re: RE: Last Call: Language Tagging in Unicode Plain Text to Proposed

From: Rick McGowan
Date: Fri Jul 10 1998 - 13:40:39 EDT

Qifan asked...

> I just have a small comment to the proposal which is why only including
> TAGGED version of ASCII characters in to the set? Why can not tags
> be composed of non-ASCII characters?

Maybe the paper's rationale doesn't come through so well. The entire POINT
of this thing is to make a set of "characters" that are NOT NORMAL
characters, and use those for tagging. The set is kept small so that the
entire Unicode/10646 character set does not have to be replicated in
Plane 14.

The requirement of protocols that are anticipated to use this scheme is that
tags constructed with these tag characters can be reliably distinguished
from the real text. It's only necessary to have a small set so that unique
tags can be constructed. It is NOT required to express all possible text
with these things, only to express limited tokens used for tagging schemes.
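
To make the "limited tokens" idea concrete, here is a rough sketch in Python.
It assumes the tag characters mirror printable ASCII at a fixed offset of
0xE0000 in Plane 14 and that a language tag starts with an introducer at
U+E0001. The names make_language_tag, TAG_OFFSET and LANGUAGE_TAG are
made up for illustration and are not taken from the paper.

```python
# Assumption: each tag character sits at (ASCII code point + 0xE0000).
TAG_OFFSET = 0xE0000
# Assumption: U+E0001 introduces a language tag.
LANGUAGE_TAG = "\U000E0001"


def make_language_tag(token: str) -> str:
    """Build a tag from a short ASCII token such as 'en-US'."""
    # Only printable ASCII has a tag counterpart, so reject anything else.
    if not all(0x20 <= ord(ch) <= 0x7E for ch in token):
        raise ValueError("tag content must be printable ASCII")
    # Shift every ASCII character into the tag range.
    return LANGUAGE_TAG + "".join(chr(ord(ch) + TAG_OFFSET) for ch in token)
```

A token like "fr" becomes three code points, none of which can be mistaken
for ordinary text, which is all the scheme needs.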

The purpose of all this is to support out-of-band tagging in Internet (and
maybe other) protocols so that you can immediately and unambiguously
distinguish TAG data from "real" data, and strip or skip the tags as
needed.
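
Stripping or skipping can be sketched in a few lines of Python. This
assumes the tag characters occupy the single range U+E0000..U+E007F in
Plane 14; the names is_tag_char and strip_tags are hypothetical helpers,
not part of any protocol.

```python
# Assumption: all tag characters live in U+E0000..U+E007F.
TAG_BLOCK = range(0xE0000, 0xE0080)


def is_tag_char(ch: str) -> bool:
    """True if ch is a tag character rather than ordinary text."""
    return ord(ch) in TAG_BLOCK


def strip_tags(text: str) -> str:
    """Return text with every tag character removed."""
    return "".join(ch for ch in text if not is_tag_char(ch))
```

Because the test is a single range check on the code point, a receiver that
knows nothing about the tag contents can still drop them safely.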

Does that make sense?


