From: Philippe Verdy
Date: Tue Oct 21 2003 - 02:30:19 CST

Marco Cimarosti writes:
> Now, my PuaInterpretation variable contains the following information:
> Foobar.ttf
> And my string contains the following text:
> 
> (U+E017 U+E009)
> Now, what's the next step? What am I supposed to do to find out whether,
> according to the PUA interpretation called "Foobar.ttf", U+E017 and U+E009
> are letters or not?

Indeed, I don't like the idea of tagging PUA text with "font names".

I would rather tag the PUA text with "script name tags" (I mean
the extended user-defined script codes like "x-klingon", followed by
a base codepoint indicator and a codespace length like

- this gives a real interpretation to PUAs, evaluated in their context,

- it allows remapping them locally to other ranges in case of conflict
between multiple PUA conventions in use.

- the script indicator name can be mapped locally to a character properties
database, indexed by the relative codepoint within the PUA convention's codespace.

- any number of fonts can be designed to work with PUAs even if they
share conflicting codespaces.

- any language can use this system.

- no more need for extra planes

- experimentation with new scripts that are not yet standardized is possible,
for character properties, breaking behavior, layout, grapheme clustering,

- emulation of new standardized scripts becomes possible on previous
implementations that lack support for new characters or scripts...
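
To make the points above more concrete, here is a minimal sketch in Python of how such a tag could answer the "is U+E017 a letter?" question and support local remapping. Everything in it is an assumption made for illustration: the PuaTag class, its field names, the PROPERTIES table and the category values in it are hypothetical, and no tag syntax or property database is defined anywhere in this thread.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PuaTag:
    """Hypothetical script tag: a script code plus the PUA range it claims."""
    script: str   # user-defined script code, e.g. "x-klingon"
    base: int     # first code point used by the convention
    length: int   # size of the convention's codespace

    def relative(self, cp: int) -> Optional[int]:
        """Offset of cp inside the tagged range, or None if it lies outside."""
        offset = cp - self.base
        return offset if 0 <= offset < self.length else None


# Local property database keyed by script code, then by RELATIVE code point.
# The general categories below are invented purely for this example.
PROPERTIES = {
    "x-klingon": {0x09: "Lo", 0x17: "Lo", 0x20: "Nd"},
}


def general_category(tag: PuaTag, cp: int) -> Optional[str]:
    """Look up the category of cp under the interpretation named by tag."""
    rel = tag.relative(cp)
    if rel is None:
        return None
    return PROPERTIES.get(tag.script, {}).get(rel)


def is_letter(tag: PuaTag, cp: int) -> bool:
    """True if the tagged interpretation classifies cp as a letter (L*)."""
    cat = general_category(tag, cp)
    return cat is not None and cat.startswith("L")


def remap(tag: PuaTag, text: str, new_base: int) -> Tuple[PuaTag, str]:
    """Move tagged code points to another local range, e.g. to avoid a
    clash with a second PUA convention occupying the same code points."""
    out = []
    for ch in text:
        rel = tag.relative(ord(ch))
        out.append(ch if rel is None else chr(new_base + rel))
    return PuaTag(tag.script, new_base, tag.length), "".join(out)


tag = PuaTag("x-klingon", 0xE000, 0x100)
print(is_letter(tag, 0xE017), is_letter(tag, 0xE009))
moved_tag, moved_text = remap(tag, "\uE017\uE009", 0xF000)
print(is_letter(moved_tag, ord(moved_text[0])))
```

Because the properties are indexed by relative code point, the lookup gives the same answer after remapping, which is what would let two conventions that claim the same PUA range coexist in one text.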
