Possibilities of future expansion (from Perception etc thread and fictional etc thread)

From: William Overington (WOverington@ngo.globalnet.co.uk)
Date: Sun Feb 25 2001 - 04:50:53 EST

In the thread "fictional scripts revisted" Kenneth Whistler wrote as follows.


In other words, unless someone manages to wrest the standard away
from the two committees and puts up a public website with an
"Encode Your Character Here For Free and Enter Our Sweepstakes!"
interface, I'm not going to worry about "precious codespace" and
neither should anybody else.

end quote

Yet suppose that some organization were to have "Encode Your Character Here
For Free" with light moderation only, and openly stated that the way that the
organization planned to make a profit was to encode all of the characters
suggested to it by visitors to its website into the hexadecimal range 200000
to 2FFFFF (with due allowance for missing out some of the characters for
compatibility with unicode over FF matters) and then sell a CD-ROM, so that
someone could access the characters using a unicode accessing program and
access the characters on the CD-ROM using a pointing mechanism from the
Private Use Area of unicode. Suppose that it were called the hypercode
project. It would take no more than 24 bits, and expressing such a hypercode
character in hexadecimal notation would take no more hexadecimal digits than
are used for the total range of unicode.
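
The arithmetic above can be checked with a short Python sketch. The constant
and function names are illustrative only, and the count of available values
ignores the "due allowance" for skipped values mentioned earlier, so it is an
upper bound rather than a definitive figure.

```python
# Rough check of the bit and hex-digit counts for the suggested range.
# All names here are illustrative assumptions, not part of any standard.
UNICODE_MAX = 0x10FFFF       # top of the unicode codespace
HYPERCODE_FIRST = 0x200000   # start of the suggested hypercode range
HYPERCODE_LAST = 0x2FFFFF    # end of the suggested hypercode range


def hex_digits(value: int) -> int:
    """Number of hexadecimal digits needed to write value."""
    return len(format(value, "X"))


# Bits needed to hold the largest hypercode value.
bits_needed = HYPERCODE_LAST.bit_length()

# Upper bound on the number of code values, before any are skipped.
range_size = HYPERCODE_LAST - HYPERCODE_FIRST + 1

print(bits_needed)
print(hex_digits(UNICODE_MAX), hex_digits(HYPERCODE_LAST))
print(range_size)
```

Under these assumptions the largest value fits in 22 bits, within the 24
stated, and both 10FFFF and 2FFFFF are written with six hexadecimal digits.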

Well, if it caught on, is it possible that some manufacturers of unicode
compatible word processors might build in a facility that could access the
hypercode CD-ROM? Such an event would take the initiative away from the
unicode consortium, and the unicode consortium would have no grounds
whatsoever for reasonable complaint, as the legal concept of estoppel could
be invoked on the basis of statements that unicode is a closed system that
does not go beyond hexadecimal 10FFFF.

I suggest that it would be far better to act now and for the unicode
consortium to consider regarding the address spaces beyond the 17 planes as
hypercode, so as not to confuse it with unicode, and to allow those that
wish to research in this area to do so within the overall guidance and under
the wing of the unicode consortium. I feel that that would be a good
balanced position between the two contrasting views and would allow balanced
and reasonable discussion of how hypercode was to be regulated and of the
technical standards for the quality of submission to be required.
Whether the research produced very few results or expanded into a major area
of activity, that would simply be how the research went,
yet either way the results would be contained in a balanced, reasonably
debated circumstance that would find wide acceptance.

I find the Private Use Areas of great interest and a valuable resource.
However, use of the private use characters requires agreement between users
if private use characters are to be used for exchanging information between
people. Already there is a development of the ConScript registry. This has
its influence. I am researching a concept that I am hoping to call a
uniengine that uses a few more than 1024 characters. For research purposes
I am placing it in the private use area. From the unicode documentation, I
have decided to place it in the middle, using U+EC00 to U+EFFF as a block and
placing the additional character codes in U+EB00 to U+EBFF. Yet I checked
at the ConScript registry to ensure that I was not clashing with that
research work. If the uniengine concept becomes popular maybe it will
become encoded by the committees into the standard. I feel that the
interesting point though is to ask whether, just because there has been
mention in the unicode list of that range for a particular line of research
work, notes will be made of the fact in documents here and there amongst
researchers in the unicode area, so that any possibility of a clash of
meaning with some other person's use of a particular code in those ranges is
noted. The very fact that I felt it desirable to check at the ConScript
registry is to my mind a demonstration that the private use area is already
something other than a private use area.
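
The block sizes and the kind of clash check described above can be sketched
in Python. The two uniengine ranges are those given in the text, and U+E000
to U+F8FF is the Private Use Area of the Basic Multilingual Plane. The entries
in OTHER_USES are invented placeholders for illustration and are not actual
ConScript registry allocations.

```python
# Minimal sketch of a private use range check. Function names are
# illustrative assumptions.
PUA = (0xE000, 0xF8FF)          # Private Use Area of the BMP
MAIN_BLOCK = (0xEC00, 0xEFFF)   # main uniengine block from the text
EXTRA_BLOCK = (0xEB00, 0xEBFF)  # additional codes from the text

# Hypothetical ranges claimed by other projects (invented placeholders).
OTHER_USES = {
    "example script A": (0xE000, 0xE07F),
    "example font B": (0xF000, 0xF0FF),
}


def size(block):
    """Number of code points in an inclusive (first, last) range."""
    return block[1] - block[0] + 1


def inside(block, outer):
    """True if block lies wholly within outer."""
    return outer[0] <= block[0] and block[1] <= outer[1]


def overlaps(a, b):
    """True if two inclusive ranges share at least one code point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Names of other uses that clash with either uniengine block.
clashes = [
    name
    for name, used in OTHER_USES.items()
    if overlaps(used, MAIN_BLOCK) or overlaps(used, EXTRA_BLOCK)
]

print(size(MAIN_BLOCK), size(EXTRA_BLOCK))
print(inside(MAIN_BLOCK, PUA), inside(EXTRA_BLOCK, PUA))
print(clashes)
```

A real check would need the ranges actually published by the registry or
project concerned in place of the placeholders.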

I also remember reading some time ago about a project called Junicode that
produced a font using some characters in the private use area. Are these
private character uses different from the uses in the ConScript registry and
likely either to cause clashes or to cause mutually exclusive interpretation
of files? I would think that a careful implementation of a hypercode
project with people able to register an exclusive code unit for a character
of their choice would be a solution to what appears to be already an
emerging problem.
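
As a rough illustration of what registering an exclusive code might involve,
here is a minimal Python sketch. The class name, its methods and the
first-come, next-free-value policy are all assumptions made for the example,
and it ignores both moderation and any values that would be skipped for
compatibility reasons.

```python
# Hypothetical sketch of a registry that hands out exclusive codes.
# Nothing here describes an existing system.
class HypercodeRegistry:
    def __init__(self, first=0x200000, last=0x2FFFFF):
        self._next = first    # next unallocated code value
        self._last = last     # last code value in the range
        self._by_code = {}    # code value -> description

    def register(self, description: str) -> int:
        """Allocate the next free code to one character description."""
        if self._next > self._last:
            raise RuntimeError("hypercode range exhausted")
        code = self._next
        self._by_code[code] = description
        self._next += 1
        return code

    def lookup(self, code: int):
        """Return the description registered for code, or None."""
        return self._by_code.get(code)


registry = HypercodeRegistry()
first_code = registry.register("example character one")
second_code = registry.register("example character two")
print(format(first_code, "X"), format(second_code, "X"))
```

Because each call to register moves on to a fresh value, no two
registrations can share a code, which is the property that private use
characters on their own do not provide.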

I feel that a good analogy is in the implementing of a new newsgroup on the
internet. There are regular usenet newsgroups with their detailed voting
method and there are the newsgroups in the alt hierarchy where, in
principle, anyone can add a new group just like that. There is, though, a
well established convention that one puts forward a suggestion in the
alt.config newsgroup, and in any newsgroups on similar topics that might be
affected by the proposed new newsgroup, and allows a period of seven days
for discussion. A revised proposal is perhaps made in the light of that
discussion and, in the event of no strong objections, one is welcome to
start the new group. Some system managers do not accept any alt
newsgroups onto their systems, some system managers allow anything that
comes along and some only allow those alt groups that have been through the
seven day process in the alt.config newsgroup. Certainly, it might be best
to go through the voting process and get a usenet group started, yet there
are other circumstances, quite reasonable, where that is not a realistic
possibility. For example, the alt.education.distance newsgroup was started
back in 1991. Today it has many readers, most of whom found it because it
existed in the list of available newsgroups when they started looking
through what was available. There was no possibility whatsoever of having
obtained half a dozen people to join together to start the newsgroup in

Another good analogy is to compare and contrast the two intellectual
property concepts of patent and copyright. A patent is a monopoly and can
only be obtained by an application and scrutiny process. A copyright exists
when a work is created. My understanding is that some jurisdictions
require, or permit, registration of a copyright whereas others do not. I am
making an analogy that getting a unicode character accepted is more like the
patent, in that it affects the rights of others considerably, whereas
getting a hypercode character is more like getting a copyright, in that the
facility is there for each work that is produced.

Joel Rees wrote the following.


The common character set should provide the basis for expression, not simply
catalogue a huge number of semi-meaningless token partials that have been
used by a lot of people. And the big one is that I don't care how many
characters are in extension B, it is guaranteed not to be enough to write my
wife's great-grandfather's name correctly, and if we can't record family
history on computers they aren't worth the sand they're made out of.

end quote

Could you explain the matter further please? How would the name be
expressed? Would it be by one character code or many? In order that the
whole of family histories could be recorded, how many characters would be
needed, or is it open ended?

And in the thread "Perception that Unicode is 16-bit" Kenneth Whistler wrote
the following.


You don't need to build a foundation for a 72-story building underneath
a 1-story wooden frame house, even if you do live in earthquake country.

end quote

Yet, if the house foundations are being built, the huge mechanical diggers
are on site, a work team is there and the cement trucks are scheduled, and
there is a lot of spare garden area next to the house, might it be helpful
to take the opportunity to dig out a hole and put in some foundations in
case a few years down the line someone living in the house would like to add
a summerhouse, or maybe an amateur observatory, rather than wait until they
want to do it and then investigate the costs of getting a mechanical digger
hired for a special visit at that time, and indeed, try to negotiate an
access route for the mechanical digger to the property if a lot of
surrounding houses have been built in the meantime?

William Overington

25 February 2001
