Re: Tags and the Private Use Area

From: William Overington (WOverington@ngo.globalnet.co.uk)
Date: Tue May 01 2001 - 14:53:10 EDT


Asmus Freytag wrote:

This would violate the neutrality that the Unicode
Consortium is bound to observe when it comes to
uses of the Private Use Area. By encoding characters
it would implicitly endorse the scheme (or series of
schemes) designed to use these characters.

end quote

I have read the posting several times. The first time through I did not
agree with what was written. When I read it again later, I found that I
then believed that the posting was absolutely correct, that my suggestion was
incorrect and that I had a deeper understanding of the issues involved. A
third reading put me back to disagreeing! Thus, for me, from my reading of
it, the situation seems very finely balanced. I mention this because I
would like, with permission, to comment on what is written in the posting
without necessarily either agreeing or disagreeing in total.

My first comment is that I am now by no means certain that my suggestion as
to what I felt that the Unicode Consortium could reasonably do in this
matter is valid. Yet I am not quite sure that the Unicode Consortium could
not act if it so chose, within limits.

Asmus continues:

Since such scheme(s) support only some particular
usage (or set of usages) of the private use area,
the consortium would no longer be neutral towards
*any and all* uses of the Private Use Area.

end quote

This is the core sentence of the posting for me. The question is as
follows.

Does such a scheme support only some particular usage (or set of usages) of
the private use area?

I find the phrase "or set of usages" particularly informative.

Let us consider the set of all possible usages.

Can there be found a possible usage that such a scheme would not support?
Finding just one would resolve the question.

However, since the "not finding" of one would be no proof that such a usage
does not exist, can it be proved mathematically that there is no possible
usage that such a scheme would not support? For, if it could be so proved,
then, unless there are also other reasons, the Unicode Consortium might
indeed *have* the power to act if it so chooses. This would mean that
interested people might be able to develop a system within the private use
area with the prospect of it being moved out of the private use area and
promoted to being part of the Unicode Standard if it were found useful.

Asmus continues:

Going further and outlining a protocol for such a
thing is even worse - if done by the Unicode Consortium.
However, it would be fine for any other organization
to define the protocol - but that organization could
not assign any special non-private characters.

end quote

If the Unicode Consortium found that it could act in this matter within
limits, then the definition of a protocol would be all but essential to
any such action.

What I have in mind here is a set of private use area support tags, perhaps
located in plane 14, or better in plane 0 if a contiguous block of 128
unused codes in a reasonable place outside the private use area could be
found.
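
As an illustration only, the search for such a block can be sketched in
Python. This assumes the character database bundled with whatever Python
interpreter runs it, which is not the version of the standard under
discussion here, so the result will vary from one installation to another.
The function name and the decision to skip noncharacters are my own
assumptions, and "a reasonable place" is a judgment the code does not make.

```python
import unicodedata

PUA = range(0xE000, 0xF900)  # plane 0 Private Use Area, U+E000..U+F8FF
NONCHARACTERS = set(range(0xFDD0, 0xFDF0)) | {0xFFFE, 0xFFFF}

def unassigned_runs(min_length=128):
    """List (first, last) runs of unassigned plane 0 code points.

    Only runs of at least min_length code points outside the Private
    Use Area are reported. "Unassigned" means general category Cn in
    the interpreter's own Unicode data.
    """
    runs = []
    start = None
    # Go one past U+FFFF so that a run ending at the top is closed.
    for cp in range(0x0000, 0x10001):
        free = (
            cp < 0x10000
            and cp not in PUA
            and cp not in NONCHARACTERS
            and unicodedata.category(chr(cp)) == "Cn"
        )
        if free and start is None:
            start = cp
        elif not free and start is not None:
            if cp - start >= min_length:
                runs.append((start, cp - 1))
            start = None
    return runs

for first, last in unassigned_runs():
    print(f"U+{first:04X}..U+{last:04X} ({last - first + 1} code points)")
```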

There would be a protocol saying that, in a plain Unicode text file, but not
in a rich text file, certain information about the meanings ascribed to
any private use area codes used *may, but need not,* be included using these
tags. However, *if* that information is included using these tags, then
this format of providing that information *must* be used. The protocols
could be carefully designed so that they make no limiting presumption
whatsoever as to the nature of the usages of the private use area that they
are capable of describing.

It is a matter for consideration quite how much information could be
included in such protocols without making any limiting presumption as to
usage of the private use area, but it might well be possible to provide
enough to avoid ambiguity and to allow two or more overlapping uses of the
private use area to appear in different parts of the same document. At the
very least, it would be helpful if an ordinary language comment could be
added. If a Uniform Resource Locator could be added as an optional element,
so much the better.
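
As a minimal sketch of how a comment and a URL might travel inside a plain
text file, here is some Python. It uses the existing plane 14 tag characters
(U+E0020 to U+E007E, which mirror printable ASCII) purely as a stand-in for
the private use area support tags suggested above, which do not exist. The
function names and the wording of the annotation are hypothetical and are
not part of any Unicode protocol.

```python
TAG_BASE = 0xE0000  # plane 14 tag characters sit at this offset from ASCII

def to_tags(text):
    """Map printable ASCII to the corresponding plane 14 tag characters."""
    for ch in text:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"not printable ASCII: {ch!r}")
    return "".join(chr(TAG_BASE + ord(ch)) for ch in text)

def split_tags(text):
    """Separate ordinary text from any tag-encoded annotation it carries."""
    plain, note = [], []
    for ch in text:
        cp = ord(ch)
        if TAG_BASE + 0x20 <= cp <= TAG_BASE + 0x7E:
            note.append(chr(cp - TAG_BASE))
        else:
            plain.append(ch)
    return "".join(plain), "".join(note)

# Hypothetical annotation: an ordinary language comment plus an optional URL.
note = "U+E000 is a ligature; see http://www.example.org/pua.html"
document = to_tags(note) + "Text using \ue000 here."
plain, recovered = split_tags(document)
print(recovered)
```

A reader that knows nothing of the tags can ignore them, while a reader that
does can recover the comment without guessing at the meaning of U+E000.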

I feel that protocols permitting the optional use of a font name, where the
private use area codes so described are to be regarded as displayable
characters in a particular font, do not violate neutrality as to whether any
particular use of a private use area character defined by any member of the
Unicode user community is a displayable character or a non-displayable
character. It is well known that some uses of the private use area are for
displayable characters and some for non-displayable characters. The Unicode
specification recognizes this, so providing facilities by which any member
of the Unicode user community may inform people of what he or she has
chosen to do in any particular circumstance surely cannot be wrong.

A software system seeking to make sense of a plain Unicode text file could
then use some or all of any information provided by private use area tags as
it chose, perhaps, according to the settings in its own preferences section,
displaying any comments in a dialogue box for the user to view. Such
preferences might allow a received URL for the font to be handled in one of
three ways: it could be acted upon as a URL automatically; or the file name
at the end of the URL could be automatically extracted and searched for as a
local file in the fonts directory without accessing the internet; or the
matter could be reported in a dialogue box and the user invited to make a
decision as to what to do. These are just suggestions as to what a software
manufacturer might possibly choose to do; I am not suggesting that they be
included in formal protocols, except perhaps in a note giving ideas as to
possible ways that a software system might make use of the information
provided by private use area tags.
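
The three ways of handling a font URL described above might be sketched in
Python as follows. This is only an illustration of what a software
manufacturer could choose to do: the preference names "fetch", "local" and
"ask", the function names and the example URL are all hypothetical, and the
code only decides on an action without fetching or opening anything.

```python
import posixpath
from pathlib import Path
from urllib.parse import unquote, urlparse

def font_file_name(url):
    """Return the file name at the end of a font URL."""
    return posixpath.basename(unquote(urlparse(url).path))

def plan_font_action(url, preference, fonts_dir):
    """Choose what to do with a font URL according to a user preference."""
    if preference == "fetch":
        # Act upon the URL automatically.
        return ("fetch", url)
    if preference == "local":
        # Look for the file in the fonts directory, without internet access.
        return ("search_local", Path(fonts_dir) / font_file_name(url))
    if preference == "ask":
        # Report the matter and let the user decide.
        return ("ask_user", url)
    raise ValueError(f"unknown preference: {preference!r}")

url = "http://www.example.org/fonts/Example.ttf"
action, target = plan_font_action(url, "local", "fonts")
print(action, target)
```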

I suggest that such a facility would be useful and would provide a sound
basis for the future.

The alternative is either chaos or the need to use a protocol put forward by
an organization other than the Unicode Consortium or by an individual. Yet
such a protocol would not have had the benefit of the full procedures of the
Unicode Consortium in its drafting. It would have no standing beyond
whatever informal agreement it might receive amongst some users, and
certainly not the endorsement of the Unicode Consortium. That would not be a
very good situation at all. It would in some ways be worse if members of the
Unicode user community generally came to think that the protocol, though
perhaps not the best that might have been achieved, was far better than
nothing, and so widely acknowledged it as an appropriate *quasistandard* to
use. How would new users of the Unicode system get to know about it?
Presumably not from a mention of its existence in the Unicode Standard.

So, I ask a question. Is there a formal method by which it could be decided
whether the Unicode Consortium has the powers to do what I suggest above,
please?

Asmus continues:

I do believe we are going in circles here, and
lengthy ones to boot.

end quote

Possibly. Yet I feel that your very precise explanation in your posting
has, at least for me, added valuable additional information to the
discussion.

William Overington

1 May 2001


