Re: Towards a classification system for uses of the Private Use Area

From: Michael Everson (everson@evertype.com)
Date: Sat Apr 27 2002 - 21:40:39 EDT


At 11:39 +0100 2002-04-27, William Overington wrote:

>I am presently aware of four uses, or intended uses, of the Private Use
>Area. There may well be others, of which I am interested to learn.
>
>The four of which I am aware are as follows.
>
>The ConScript registry.

This is public enough. It assigns codes in the PUA; apparently some
use has been made of these by the Shavian community at least, before
Shavian was encoded.

>Development of a character set for Egyptian hieroglyphics.

I'm involved with this and there's been no talk about using the PUA.
This is because existing 8-bit systems comprising a superset of what
is likely to be encoded can easily be transcoded directly to Unicode
without need for testing in the PUA. There are existing
implementations, and all that's needed is a set of (probably rich
text) mapping tables.

>Development of a character set for Cuneiform tablets.

I think some use of PUA characters has been made by some font
developers. But it's for exchange within a very small group of
investigators.

>The eutocode system, which is part of my own research, which is mentioned in
>the DVB-MHP (Digital Video Broadcasting - Multimedia Home Platform) section
>at http://www.users.globalnet.co.uk/~ngo which is our family webspace in
>England.

"The unicode system is today a 21 bit system. Details are at the
http://www.unicode.org website."? Where is a "21-bit system"
mentioned on the Unicode website?
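
The quoted phrase presumably refers to the size of the codespace, which
runs from U+0000 to U+10FFFF, so the largest code point needs 21 bits.
A quick arithmetic check in Python follows; it is only a sketch of that
reading, not a quotation from the Unicode website, and the constant
name is made up for illustration:

```python
# Assumption: "21 bit" refers to the number of bits needed to represent
# the largest Unicode code point, U+10FFFF.
MAX_CODE_POINT = 0x10FFFF  # illustrative name, not from any standard API

# Number of bits needed for the largest code point.
bits_needed = MAX_CODE_POINT.bit_length()
print(bits_needed)

# Python's chr() accepts exactly this range and rejects anything above it.
try:
    chr(MAX_CODE_POINT + 1)
    beyond_range_rejected = False
except ValueError:
    beyond_range_rejected = True
print(beyond_range_rejected)
```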

>Within my classification system, suppose please that someone developing the
>character codes for Egyptian hieroglyphics requests

Of whom? Of the "type tray" maintainers (an analogue to John Cowan
and me, for ConScript)? But why? ConScript has a number of fun
scripts in it and people might be interested in encoding or
exchanging more than one script; that's why there's a central
registry. But it seems to me that people exploring the
encoding of Egyptian and Cuneiform won't be worried about an overlap
-- apart from me, and if asked I'd just get my fellows to assign two
separate blocks to do it just in case there were a problem. For those
scripts, though (or for Blissymbols, another candidate for PUA test
implementation), the intention would be to use PUA as a very
temporary stopgap for testing.

>that he or she be assigned type tray 3001 and that someone
>developing character codes for cuneiform requests the assignment of
>type tray C001 and that I request type tray E001 for the eutocode
>system, and that all of these requests are granted.

That's more or less how ConScript functions.

>Then, in order to apply the classification system to any plain text file,
>the file needs to contain some classification characters near the start.
>
>For a file using the Egyptian hieroglyphics characters, the following
>sequence would be needed.
>
>U+F35B U+F333 U+F330 U+F330 U+F331 U+F35D

I don't understand this. Just assign Egyptian and Cuneiform to two
separate areas.
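
For readers trying to follow the quoted sequence: it looks as though
each classification character is U+F300 plus an ASCII code, so that
the six code points spell out the type tray number in brackets. That
pattern is inferred only from the one sequence quoted above, not from
any published specification, and the names below are hypothetical. A
minimal sketch in Python:

```python
# Assumption: each classification character is U+F300 plus the ASCII
# code of a bracket or hexadecimal digit. This is inferred from the
# single sequence quoted in this message, nothing more.
PUA_TAG_BASE = 0xF300  # hypothetical name for the inferred offset


def decode_tag(code_points):
    """Map each PUA code point back to the ASCII character it mirrors."""
    return "".join(chr(cp - PUA_TAG_BASE) for cp in code_points)


def encode_tag(text):
    """Map each ASCII character to its counterpart in the U+F3xx range."""
    return [PUA_TAG_BASE + ord(ch) for ch in text]


# The sequence quoted for the Egyptian hieroglyphics example.
sequence = [0xF35B, 0xF333, 0xF330, 0xF330, 0xF331, 0xF35D]
print(decode_tag(sequence))  # prints "[3001]"
```

Under the same assumption, encode_tag("[3001]") gives back the six
code points above.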

>Suppose then that one day someone comes across a plain text file and within
>that plain text file are character codes from the Private Use Area and that
>person has no idea as to which character set those character codes may be
>intended to represent.

The person wouldn't be in that position, because PUA values are
agreed between sender and receiver.

>So, the person looks at that file using a word processing program and
>chooses to use a specially made fount named findpuac.ttf (that is, the find
>private use area classification fount) which has all characters as zero
>width except for the eighteen characters in the U+F3.. block which I
>mentioned in my previous posting, those eighteen characters being
>implemented in the findpuac.ttf fount as having analysis glyphs as detailed
>in my previous posting.

I think people working with Egyptian, Cuneiform, or Blissymbols will
use their own fonts for private research, and I can't imagine how a
central clearing house would benefit them.

>Whilst I like to hope that many people will wish to use this classification
>system, in fact I do not need the agreement of anyone in order to get the
>system started as the Unicode specification includes, in relation to the
>Private Use Area, an implied right to publish character assignments.

Indeed, this is true. You just have to do as John Cowan and I did,
and make something available.

-- 
Michael Everson *** Everson Typography *** http://www.evertype.com


