Re: PUA

From: Doug Ewell (dewell@adelphia.net)
Date: Mon Oct 20 2003 - 00:18:56 CST


Chris Jacobs <chris dot jacobs at freeler dot nl> wrote:

> As I understand the position of the designers of Unicode they
> definitely don't want to be in charge of this and want to let the
> users of the PUA fight it out among themselves.

"Come to a mutual agreement" is probably more in the spirit. I doubt
the original designers of Unicode expected much competition among PUA
mappings.

> Nevertheless I think if Unicode doesn't want to decide how the PUA is
> to be interpreted, it should at the very least provide a mechanism
> by which a user of the PUA can specify which specification he
> prefers.

I'm pretty sure UTC wants to stay as far away as possible from something
like this that could be misunderstood as running a PUA registry.

> I plan to propose such a mechanism:
>
> I want to propose a char with the following properties:
>
> Scalar Value: U+E0002
>
> This starts a PUA interpretation selector tag.
> The content of the tag is a Font family name.
> For all PUA chars between this tag and the corresponding Cancel tag
> the copyright holder of the font is the sole authority about how the
> PUA should be interpreted.
>
> Any comments?

Plenty. You're assuming a one-to-one relationship between font and PUA
mapping, and especially between font maker and PUA registration
authority, that doesn't necessarily exist. Code2000, for instance, is
not the only font that covers some of the ConScript ranges, particularly
Tengwar and Klingon. For the PUA mappings established by Microsoft and
Apple, there are numerous fonts distributed not only by those companies,
but by others.

Ideally, PUA characters should also have complete (or nearly complete)
information on Unicode properties, such as directionality and combining
class. This isn't necessarily the kind of information you could get by
asking the font vendor or examining a font file. Font files don't even
have Unicode character names, just short identifiers like "aacute."
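
To make the property gap concrete, here is a minimal sketch using Python's
standard unicodedata module. The helper name default_properties is made up
for illustration. It only reports what the Unicode Character Database
itself says about a code point, which for the PUA is nothing more than
generic defaults.

```python
import unicodedata

def default_properties(ch):
    """Return the properties the Unicode Character Database gives one character.

    For a PUA code point these are only generic defaults: the standard
    assigns no name and no script-specific behavior, so anything more
    must come from a private agreement outside the standard.
    """
    return {
        "name": unicodedata.name(ch, None),       # None when the UCD has no name
        "category": unicodedata.category(ch),     # general category
        "bidi": unicodedata.bidirectional(ch),    # directionality class
        "combining": unicodedata.combining(ch),   # canonical combining class
    }

# U+E000 is the first code point of the BMP Private Use Area.
print(default_properties("\ue000"))
```

A right-to-left or combining PUA character would need its real properties
supplied from somewhere else, and a font file is not an obvious place to
find them.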

Despite the wording "For all PUA chars...", there is no real guarantee
that an implementation would respect this font tag for PUA characters
only, and I think there'd have to be.
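
As a rough sketch of what "PUA characters only" would have to mean in an
implementation, assuming the proposed tag worked like the existing Plane 14
language tag: U+E0002 is the hypothetical code point from the proposal and
is not an assigned character, the function names are invented, and treating
a bare U+E007F CANCEL TAG as the end of scope is a simplification.

```python
# Hypothetical: the tag proposed in this thread, NOT an assigned character.
PUA_SELECTOR = 0xE0002
# Existing Plane 14 tag characters: U+E0020..U+E007E mirror ASCII 0x20..0x7E.
TAG_FIRST, TAG_LAST = 0xE0020, 0xE007E
CANCEL_TAG = 0xE007F  # U+E007F CANCEL TAG

# The three Private Use ranges: BMP PUA, Plane 15, Plane 16.
PUA_RANGES = ((0xE000, 0xF8FF), (0xF0000, 0xFFFFD), (0x100000, 0x10FFFD))

def is_pua(cp):
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)

def make_tag(family):
    """Spell a printable-ASCII font family name in Plane 14 tag characters."""
    if not all(0x20 <= ord(c) <= 0x7E for c in family):
        raise ValueError("tag content must be printable ASCII")
    return chr(PUA_SELECTOR) + "".join(chr(0xE0000 + ord(c)) for c in family)

def annotate(text):
    """Return (char, family) pairs; family is set only for PUA chars in scope."""
    result = []
    family = None      # font family currently in scope, if any
    reading = False    # True while consuming the tag's content characters
    for ch in text:
        cp = ord(ch)
        if cp == PUA_SELECTOR:
            family, reading = "", True
        elif cp == CANCEL_TAG:
            family, reading = None, False
        elif reading and TAG_FIRST <= cp <= TAG_LAST:
            family += chr(cp - 0xE0000)
        else:
            reading = False
            # The scoping rule: non-PUA characters never pick up the family.
            result.append((ch, family if is_pua(cp) else None))
    return result

text = "A" + make_tag("Code2000") + "\ue000B" + chr(CANCEL_TAG) + "\ue001"
for ch, fam in annotate(text):
    print(f"U+{ord(ch):04X}", fam)
```

Nothing in the proposal forces an implementation to apply the is_pua check.
A renderer could just as easily switch fonts for the whole tagged run.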

Finally, there is not much sentiment within the UTC for expanding the
role of Plane 14 tags in general. In my November 2002 paper "In defense
of Plane 14 language tags" (L2/02-396R), I wrote that deprecating those
tags (which was under discussion at the time) would implicitly deprecate
the entire concept of Plane 14 tagging, and discourage the introduction
of new, non-language-related Plane 14 tags like the one you describe.
As it turns out, there are those who feel that would be a good thing.

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/


