Re: Unicode 3.1: incomplete tags considered harmless/useful

From: DougEwell2@cs.com
Date: Thu Feb 01 2001 - 02:25:00 EST

Next message: Murray Sargent: "RE: Property error for U+2118?"
Previous message: DougEwell2@cs.com: "Fwd: Character encoding systems for Arabic Web pages ?"
Maybe in reply to: John Cowan: "Unicode 3.1: incomplete tags considered harmless/useful"

In a message dated 2001-01-31 12:19:33 Pacific Standard Time,
jcowan@reutershealth.com writes:

> The section "Dangers of Incomplete Support" in section 13.7 seems to me
> to be far too strongly worded; it should be weakened or removed
> altogether.
>
> In particular, there is no reason why sequences of tag characters
> not beginning with LANGUAGE TAG or CANCEL TAG cannot be used
> for various purposes by private agreement. However, as currently
> worded, language-tag-interpreting applications SHOULD remove them,
> contrary to the usual Unicode view of not-understood content
> ("leave it alone").

What would be the meaning or benefit of a sequence of tag characters *not*
beginning with a tag header in the range U+E0001 through U+E001F? We are
already promised that tag characters may only be used to form valid tags, so
I don't see any benefit in allowing their use for privately defined purposes.
But clearly the restriction to U+E0001 LANGUAGE TAG and U+E007F CANCEL TAG
will be inappropriate as soon as another type of tag is defined.
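
To make the question concrete, here is a minimal sketch of how a process might find runs of tag characters that do not begin with a tag header. It assumes the header range mentioned above (U+E0001 through U+E001F) and also accepts U+E007F CANCEL TAG as a valid start; the function names are hypothetical and not taken from any specification or library.

```python
import re

# Any run of code points in the Plane 14 tag block (U+E0000..U+E007F).
TAG_RUN = re.compile("[\U000E0000-\U000E007F]+")

CANCEL_TAG = 0xE007F


def has_valid_start(run):
    """True if the run begins with a tag header (U+E0001..U+E001F,
    the range assumed in this message) or with CANCEL TAG."""
    first = ord(run[0])
    return 0xE0001 <= first <= 0xE001F or first == CANCEL_TAG


def headerless_runs(text):
    """Return every run of tag characters that lacks a valid start.

    The runs are only reported, not removed; what to do with them
    is exactly the policy question under discussion.
    """
    return [m.group() for m in TAG_RUN.finditer(text)
            if not has_valid_start(m.group())]
```

Note that the check is on the header range rather than on U+E0001 alone, so it would not need to change if another type of tag were defined within that range.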

> Nor is there any reason why a CANCEL TAG should be required to exist for
> every LANGUAGE TAG; in particular, a LANGUAGE TAG at the beginning
> of plain text that is meant to apply to the whole text (document,
> human-readable-string in protocols, etc.) should be unproblematic.
> As currently worded, editors SHOULD not permit such uses.

This makes sense, and in fact I was not aware of any such requirement.
Technical Report #7 specifically mentions the legitimate possibility of
language-tagged text going out of scope (i.e. hitting EOF) without a CANCEL
TAG.
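
As an illustration of that case, the following sketch splits text into (language, text) spans and simply lets a language tag stay in effect until end of input when no CANCEL TAG appears. It is a simplified reading of the tagging scheme, not a full implementation: the helper names are hypothetical, and stray tag characters outside a language tag are left in the text untouched.

```python
LANGUAGE_TAG = 0xE0001
CANCEL_TAG = 0xE007F


def language_tag(code):
    """Build a language tag, e.g. language_tag("fr")."""
    return chr(LANGUAGE_TAG) + "".join(chr(0xE0000 + ord(c)) for c in code)


def language_spans(text):
    """Split text into (language or None, plain text) spans."""
    spans, buf = [], []
    current = None   # language currently in effect
    pending = None   # tag letters being collected after LANGUAGE TAG
    for ch in text:
        cp = ord(ch)
        if cp == LANGUAGE_TAG or cp == CANCEL_TAG:
            if buf:  # close the span governed by the previous tag
                spans.append((current, "".join(buf)))
                buf = []
            if cp == LANGUAGE_TAG:
                pending = []
            else:
                current, pending = None, None
        elif pending is not None and 0xE0020 <= cp <= 0xE007E:
            pending.append(chr(cp - 0xE0000))
        else:
            if pending is not None:  # first ordinary character ends the tag
                current, pending = "".join(pending), None
            buf.append(ch)
    if buf:  # reaching EOF without CANCEL TAG is not treated as an error
        spans.append((current, "".join(buf)))
    return spans
```

For example, language_spans("hi " + language_tag("fr") + "bonjour") yields one untagged span followed by a span tagged "fr" that runs to the end of the text.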

-Doug Ewell
Fullerton, California

