Re: Tags and the Private Use Area

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue May 01 2001 - 17:01:23 EDT


William Overington perorated:

> Asmus continues:
>
> Going further and outlining a protocol for such a
> thing is even worse - if done by the Unicode Consortium.
> However, it would be fine for any other organization
> to define the protocol - but that organization could
> not assign any special non-private characters.
>
> end quote
>
> If the Unicode Consortium found that it could act in this matter within
> limits then definition of a protocol would be all but an essential part of
> any such action.

But precisely as Asmus has stated, the Unicode Consortium (or more
properly the Unicode Technical Committee) would not define any
such protocol for private use characters. It is entirely up to
external organizations to engage in such work if they choose to do
so.

>
> What I have in mind here is a set of private use area support tags, perhaps
> located in plane 14, better located in plane 0 if a contiguous block of 128
> unused codes in a reasonable place not within the private use area could be
> found.

This would be a request for encoding of standardized characters, which
the Unicode Technical Committee would, of course, have to decide upon.
However, judging by my experience in the Unicode Technical Committee and
the feedback so far on this list (and the feedback received on the
language tag characters in Plane 14, which were "deprecated on birth"), it
is rather unlikely that any such proposal would be approved by the
UTC.

Among other things, you have yet to meet the challenge by Michael
Kaplan to provide a convincing case for their requirement.

>
> There would be a protocol saying that, in a plain unicode text file, but not
> in a rich text file,

This distinction already creates a problem for your proposal. Rich text
contains chunks of plain text, and introducing a bunch of tag characters
and a protocol for using them which have to be kept out of rich text, but
which can be in plain text, creates a filtering and transducement problem.
That would introduce a problem, rather than eliminating a problem.
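
To make the filtering burden concrete, here is a minimal sketch in Python
of what every process that moves plain text into rich text would have to
do. It assumes, purely for illustration, that the proposed tags occupy a
contiguous block of 128 code points, and it borrows the range of the
existing Plane 14 tag characters (U+E0000..U+E007F) as a stand-in. The
names TAG_FIRST, TAG_LAST and strip_tags are hypothetical.

```python
# Stand-in range for the proposed tags: the existing Plane 14 tag block.
# The proposal itself does not fix a range; this is an assumption.
TAG_FIRST = 0xE0000
TAG_LAST = 0xE007F


def strip_tags(text: str) -> str:
    """Return text with every code point in the assumed tag range removed."""
    return "".join(
        ch for ch in text if not (TAG_FIRST <= ord(ch) <= TAG_LAST)
    )


# Plain text carrying two tag characters between "abc" and "def".
plain = "abc" + chr(0xE0001) + chr(0xE0065) + "def"
clean = strip_tags(plain)
```

The filter itself is trivial. The problem is that it has to be present,
and applied consistently, in every conversion path between the two kinds
of text.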

> certain information related to the meanings ascribed to
> any private use area codes used *may, but need not* be included using these
> tags. However, *if* that information is included using these tags, then
> this format of providing that information *must* be used.

... by a process that chooses to honor that protocol. But that would be
outside the scope of the Unicode Standard and nothing that one could depend
on a process conformant to the Unicode Standard to be following, merely by
virtue of that conformance claim.

> The protocols
> could be carefully designed so that no limiting presumption whatsoever as to
> the nature of the usages of the private use area that were capable of being
> described using the protocols were made by the protocols.
>
> I suggest that such a facility would be useful and would provide a sound
> basis for the future.

I don't think so. Frankly I don't think things would be any better than with
the kind of plain text alternatives that have already been suggested.

Or, if you are convinced that there really is sufficient reason and demand
to automate the processing, an alternative is simply to provide for
a PUAconventions.xml file, which would contain the information you are
suggesting for the protocol. Point at the appropriate PUAconventions.xml
file, and you get the equivalent of trying to bury such information in plain
text files, without actually touching the plain text files or requiring
any additions to the Unicode Standard.
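
As a rough sketch of how that could work, the following Python reads such
a file and reports what each private use character in a text means under
that convention. No format for PUAconventions.xml has been defined, so the
element and attribute names (puaConventions, char, cp, name) and the two
sample entries are invented for illustration. Only the private use ranges
themselves come from the standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical file contents; the schema is an assumption, not a proposal.
SAMPLE = """<puaConventions>
  <char cp="E000" name="EXAMPLE SYMBOL ONE"/>
  <char cp="E001" name="EXAMPLE SYMBOL TWO"/>
</puaConventions>"""

# Private use ranges: BMP, Plane 15, Plane 16.
PUA_RANGES = ((0xE000, 0xF8FF), (0xF0000, 0xFFFFD), (0x100000, 0x10FFFD))


def load_conventions(xml_text):
    """Map each code point listed in the file to its declared name."""
    root = ET.fromstring(xml_text)
    return {int(el.get("cp"), 16): el.get("name") for el in root.iter("char")}


def is_private_use(cp):
    """True if cp falls in one of the private use ranges."""
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def describe_pua(text, conventions):
    """List (code point label, name) for each private use character in text."""
    found = []
    for ch in text:
        cp = ord(ch)
        if is_private_use(cp):
            name = conventions.get(cp, "<not defined by this convention>")
            found.append(("U+%04X" % cp, name))
    return found


conventions = load_conventions(SAMPLE)
report = describe_pua("a\ue001b", conventions)
```

The text being described is never modified; the interpretation lives
entirely in the side file.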

>
> The alternative is either chaos or the need to use a protocol put forward by
> an organization other than the Unicode Consortium or by an individual.

Exactly.

> Yet
> such a protocol would not have had the benefit of the full procedures of the
> Unicode Consortium in its drafting and would have no standing above any
> informal agreement amongst some users that it might receive and certainly
> not the endorsement of the Unicode Consortium.

It would have the standing of whatever organization chose to standardize
such a protocol. Which is entirely appropriate.

If you were expecting such a protocol to receive worldwide acceptance and
be usable on the Internet, then you should anticipate that you would have
to get it approved by the IETF as an Internet Standard.

If, on the other hand, it were just a matter of ensuring interoperability
of private Blissymbolics implementations, then you could get endorsement
by the Blissymbolics Institute.

And so on.

>
> So, I ask a question. Is there a formal method for the matter as to whether
> the Unicode Consortium has the powers to do what I suggest above to be
> formally decided please?

See:

http://www.unicode.org/pending/proposals.html

But note that the proposals that the Unicode Technical Committee
invites are related to the encoding of *characters*, rather than
the development of higher-level protocols.

--Ken