From: Asmus Freytag
Date: Sat, 20 Aug 2011 19:49:19 -0700

On 8/20/2011 6:44 PM, Doug Ewell wrote:
> Would that really be a better default? I thought the main RTL needs for the PUA would be for unencoded scripts, not for even more Arabic letters. (How many more are there anyway?)
> In any case, either 'R' or 'AL' as the Plane 16 default would be an improvement over having 'L' for the entire PUA.

The best default would be an explicit "PU" - undefined behavior in the
absence of a private agreement.

However, it helps to remember why the PUAs exist to begin with. The
demand came from East Asian character sets, which had long had such
private use areas. In their case, the issue of properties did not
seriously arise, because the vast bulk of private characters were

I bet this remains true, and so the original motivation for the
suggestion of "L" as the default would still apply - no matter how
unsatisfactory this is from a formal point of view.

If maintaining the "L" default were to fail on the cliff of political
correctness (or the "fairness" argument that has been made), the only
proper solution is to use a value of "unknown" (i.e., the hypothetical
PU value) for all private use code points.

There are some properties where stability guarantees prevent adding a
new value. In that case, the documentation should point out that the
intended effect was to have a PU value, but for historical / stability
reasons, the tables contain a different entry.
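
As a rough illustration of that last point, here is a minimal sketch
of how an implementation might treat the tabulated value as a
placeholder for "PU". The "PU" value, the effective_bidi_class
function and the private_agreement mapping are all hypothetical names
made up for this sketch; only the unicodedata calls are real.

```python
import unicodedata

# Hypothetical sentinel: "PU" is NOT an actual Bidi_Class value.
PU = "PU"


def is_private_use(ch):
    # General_Category Co covers the BMP PUA and the Plane 15/16 PUAs.
    return unicodedata.category(ch) == "Co"


def effective_bidi_class(ch, private_agreement=None):
    """Return the bidi class a process might act on for ch.

    private_agreement is an assumed mapping from private use
    characters to the bidi class the parties have agreed on.
    """
    if is_private_use(ch):
        if private_agreement and ch in private_agreement:
            return private_agreement[ch]
        # No agreement: report "unknown" instead of the table entry.
        return PU
    # Everything else: use the value from the character database.
    return unicodedata.bidirectional(ch)
```

With an agreement such as {"\ue000": "R"}, U+E000 would be treated as
"R", while every other private use code point would still come back
as "PU" rather than as whatever the tables happen to contain.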

Suggesting a "structure" on the private use area, by suggesting
different default properties, ipso facto makes the PUA less private.
That should be a non-starter.

Received on Sat Aug 20 2011 - 21:53:30 CDT
