Re: PUAs and other charsets

From: John Cowan
Date: Fri Feb 25 2000 - 14:05:46 EST

Kenneth Whistler wrote:

> In principle, no mapping can be defined, since no standard characters
> are defined there,

That is perhaps debatable.

We find the term "private use characters", not simply "private
use codepoints", suggesting that E000 (for example) is an existent
character whose significance must be defined privately --- rather than
a codepoint not mapped to any character, to which private persons are
allowed to assign their own characters. (A subtle distinction, but

> and the mappings are intended as mapping of *characters*
> to *characters*, not *code points* to *code points*.

How so? A mapping of characters would say "The character called
LATIN CAPITAL LETTER A in 8859-5 is equivalent to the character
called LATIN CAPITAL LETTER A in Unicode." Or is that what the
mapping tables are intended to mean, and the codepoints are given
solely to ease identification?
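
To make the question concrete, here is a minimal Python sketch of the two
readings. It assumes Python's bundled "iso8859_5" codec as a stand-in for
the published mapping table; describe() is a hypothetical helper, not part
of any standard. The table itself pairs code points, and the character
name is looked up afterwards to identify what was mapped.

```python
import unicodedata

def describe(byte_value, codec="iso8859_5"):
    """Map one byte through the codec's table and name the result."""
    # The table lookup itself is code point -> code point.
    ch = bytes([byte_value]).decode(codec)
    # The name is the character-level identification of that code point.
    return f"0x{byte_value:02X} -> U+{ord(ch):04X} {unicodedata.name(ch)}"

print(describe(0x41))  # the Latin letter, same position as in ASCII
print(describe(0xB0))  # a Cyrillic letter from the upper half of 8859-5
```

Either reading yields the same table; the difference is only in which
column is taken to be the definition and which the annotation.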

> Some systems are designed so that
> the UDC's will survive interchange between the host and the client systems,
> converting back and forth from EBCDIC PUA and the PC PUA.

And a Good Thing too. In that way, a privateer (:-)) can say
"The character whose codepoint in CPxxx is xxxx, and whose codepoint
in CPyyy is yyyy, and whose Unicode codepoint is U+Ezzz, is defined

> It would be quite a different thing for the Unicode Consortium to
> start generally suggesting particular mappings of PUA's into the
> Unicode PUA. That smacks of attempting to standardize PUA usage -- which
> the Unicode Consortium cannot and will not do.

I don't see how. It simply defines a formal and abstract equivalence
between PUCs, and says nothing about the (private) significance of any
specific PUC.
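
A rough sketch of what such an abstract equivalence could look like, in
Python. Everything specific here is an assumption for illustration: the
start values 0x6900 and 0xF000, the count of 16, and the simple linear
layout are hypothetical and do not describe any real code page. Only the
BMP Private Use Area bounds (U+E000..U+F8FF) come from the standard.

```python
PUA_FIRST, PUA_LAST = 0xE000, 0xF8FF  # BMP Private Use Area

def make_pua_mapping(start, count, pua_start=PUA_FIRST):
    """Pair `count` user-defined code points with Unicode PUA code points.

    Nothing is said about what any of the characters mean.
    """
    if pua_start < PUA_FIRST or pua_start + count - 1 > PUA_LAST:
        raise ValueError("mapping would fall outside the PUA")
    to_unicode = {start + i: pua_start + i for i in range(count)}
    from_unicode = {u: c for c, u in to_unicode.items()}
    return to_unicode, from_unicode

# Hypothetical user-defined ranges for a host and a PC code page.
host_to_uni, uni_to_host = make_pua_mapping(0x6900, 16)
pc_to_uni, uni_to_pc = make_pua_mapping(0xF000, 16)

def host_to_pc(code):
    """Convert host -> PC by pivoting through the Unicode PUA."""
    return uni_to_pc[host_to_uni[code]]

def pc_to_host(code):
    """Convert PC -> host by pivoting through the Unicode PUA."""
    return uni_to_host[pc_to_uni[code]]

# The round trip preserves the code point without interpreting it.
assert pc_to_host(host_to_pc(0x6903)) == 0x6903
```

The tables fix only which code points correspond to one another; the
significance of each character remains a private agreement.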


