Re: An attempt to focus the PUA discussion [long]

From: Philippe Verdy (
Date: Fri Apr 30 2004 - 22:22:32 EDT

From: "Ernest Cline" <>
> You've entirely missed the point I was trying to raise, Philippe.
> It was not the ability to do private normalization of private use
> characters that I was calling for, but making it easy to do.
> Private Variation Selectors and Private Combining Marks
> with a non-zero combining class that have appropriate
> default properties are not necessary to enable normalization
> according to a private use version of Unicode. However,
> they do make such private normalizations easier to achieve,
> as they are not dependent upon the private use agreement
> changing the defaults of properties that affect normalization.
> Besides, I don't think we want to encourage having font-based
> overrides of the properties that affect normalization. That
> could raise all sorts of potential security holes unless such
> changes were both restricted to only the PUA and PUA
> characters were not allowed in secure applications. (The
> latter may well be a good idea unless the private use that
> is desired specifically calls for such characters.)

I have not missed your point. In fact I agree with all your arguments above.

What I was saying is that the PUAs do not forbid you from making any arrangement
you want. I did not say that it would be easy: Unicode will not standardize
anything beyond a default behavior for conforming applications, so
that it won't break applications that try to use PUAs according to their own
private conventions.

So applications that are not prepared to use PUAs of any sort can still be
conforming by handling them as non-composable, non-decomposable symbols with
combining class 0. That way, if such applications act as intermediaries in the
communication between two users of a private convention unknown to them, they
won't break the text containing PUAs.
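As a small illustration of that default behavior, here is a sketch using
Python's standard unicodedata module (the sample string and the helper name
private_use_chars are mine, chosen only for the example). It shows the default
properties of a PUA code point and checks that the four normalization forms
leave PUA characters untouched and in place:

```python
import unicodedata

def private_use_chars(text):
    """Return the private-use (General_Category Co) characters of text, in order."""
    return [ch for ch in text if unicodedata.category(ch) == "Co"]

pua = "\uE000"  # first code point of the BMP Private Use Area

# Default properties: category Co, combining class 0, no decomposition mapping.
default_props = (
    unicodedata.category(pua),
    unicodedata.combining(pua),
    unicodedata.decomposition(pua),
)
print(default_props)  # ('Co', 0, '')

# Ordinary combining sequences around two PUA characters (BMP and plane 15).
sample = "e\u0301" + pua + "\U000F0000" + "a\u030A"

for form in ("NFC", "NFD", "NFKC", "NFKD"):
    normalized = unicodedata.normalize(form, sample)
    # The PUA characters are neither composed, decomposed nor reordered.
    assert private_use_chars(normalized) == private_use_chars(sample)
```

An application that knows nothing about the private convention gets this
behavior for free from a stock normalizer.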

For a renderer, however, there is no communication with another user: the
communication is from the user to his screen or printer. A font will most often
be the only way for a user to instruct his renderer to use his PUAs the way he
wants. Other inputs to the renderer could be language tags inserted in the plain
text, or options passed to the renderer via an out-of-band API (for example with
rich-text formats). In either case, a renderer can be built so that it supports
the customizations that a user wants for his PUAs.

And no, I don't think that font-based overrides should apply outside the PUAs. I
never stated that and never asked for that.

A renderer may be customized by users with language tags, as I said, or through
other means, but it should not accept customizations that fall outside the PUA
ranges (unless it gives a strong notice informing the user that his
customizations are probably invalid and possibly dangerous for the security or
the accuracy of the generated documents; some users still do that today by
hacking existing assigned standard code points, but this breaks lots of
assumptions about the encoding accuracy of documents).
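A minimal sketch of such a check, assuming the customizations arrive as a
mapping from code points to property overrides (the names is_private_use and
split_overrides and the shape of the mapping are hypothetical, not any
renderer's actual API). The three ranges are the private use areas defined by
Unicode:

```python
# The three Unicode private use ranges (inclusive bounds).
PUA_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area (BMP)
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)

def is_private_use(code_point):
    """True if code_point lies in one of the private use ranges."""
    return any(low <= code_point <= high for low, high in PUA_RANGES)

def split_overrides(overrides):
    """Split user overrides into (accepted, rejected).

    overrides maps a code point to a dict of property overrides
    (hypothetical format). Only PUA code points are accepted; the rest
    are returned separately so the caller can warn the user about them.
    """
    accepted = {}
    rejected = {}
    for code_point, props in overrides.items():
        target = accepted if is_private_use(code_point) else rejected
        target[code_point] = props
    return accepted, rejected

accepted, rejected = split_overrides({
    0xE000: {"combining_class": 230},  # PUA: accepted
    0x0041: {"combining_class": 230},  # LATIN CAPITAL LETTER A: rejected
})
for code_point in rejected:
    print(f"warning: ignoring override for assigned code point U+{code_point:04X}")
```

Whether the rejected entries are dropped or merely flagged with a strong notice
is the policy decision discussed above; the sketch only separates the two sets.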

So a font may very well include some custom character property tables, but
this should be allowed only for PUA ranges. Or a font may just contain a simple
signature of the private PUA agreement for which the font was designed. The
signature (which can be encoded in language tags in plain text or in metadata
embedded in fonts) may be a URI (I mean a URL, a URN or a random UUID) referring
to definition files associated with the necessary custom property tables. For
example, a font of PUAs assigned according to the ConScript registry should
contain a URI referencing the ConScript agreement and its version and/or date.
It's up to private users or registry holders to maintain the reference data
associated with that URI signature (Unicode does not need to be involved here).
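To make the signature idea concrete, here is a rough sketch of how a renderer
might resolve a signature to a private property table. Everything in it is
hypothetical: the PrivateAgreement class, the registry dictionary, and the
example URN are invented for illustration and do not describe any existing
font format or registry.

```python
import unicodedata
from dataclasses import dataclass, field

@dataclass
class PrivateAgreement:
    """Definition data published by whoever maintains a private agreement."""
    uri: str                 # the signature carried by the font or language tag
    version: str
    combining_classes: dict = field(default_factory=dict)  # code point -> ccc

def combining_class(code_point, signature, registry):
    """Combining class of code_point under the agreement named by signature."""
    ch = chr(code_point)
    if unicodedata.category(ch) != "Co":
        # Never override assigned characters: use the standard property.
        return unicodedata.combining(ch)
    agreement = registry.get(signature)
    if agreement is None:
        # Unknown agreement: fall back to the Unicode default for PUAs.
        return 0
    return agreement.combining_classes.get(code_point, 0)

# Hypothetical signature and table; a real one would come from the registry holder.
EXAMPLE_URI = "urn:example:private-agreement:2004-04"
registry = {
    EXAMPLE_URI: PrivateAgreement(
        uri=EXAMPLE_URI,
        version="2004-04",
        combining_classes={0xE001: 220},  # a private combining mark below
    )
}

print(combining_class(0xE001, EXAMPLE_URI, registry))            # 220
print(combining_class(0xE001, "urn:example:unknown", registry))  # 0
```

A renderer that does not recognize the signature simply gets the default
properties, so the text is still handled safely.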
