Re: PUA

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Oct 21 2003 - 02:30:19 CST


Marco Cimarosti <marco.cimarosti@essetre.it> writes:
> Now, my PuaInterpretation variable contains the following information:
>
> Foobar.ttf
>
> And my string contains the following text:
>
> 
> (U+E017 U+E009)
>
> Now, what's the next step? What am I supposed to do to find out whether,
> according to the PUA interpretation called "Foobar.ttf", U+E017 and U+E009
> are letters or not?

Indeed, I don't like the idea of tagging PUA text with "font name
tags".

I'd rather tag the PUA text with "script name tags" (I mean the
extended user-defined script codes like "x-klingon", followed by
a base codepoint indicator and a codespace length, as in
"x-klingon;b=E000;l=80"):

- this gives PUA code points a real interpretation, evaluated in their
context

- it allows remapping them locally to other ranges when multiple PUA
conventions conflict

- the script indicator name can be mapped locally to a character properties
database, indexed at the relative codepoint in the PUA convention codespace.

- any number of fonts can be designed to work with PUAs even if they
use conflicting codespaces.

- any language can use this system.

- no more need for extra planes

- experimentation with scripts that are not yet standardized becomes
possible, including their character properties, breaking behavior,
layout, grapheme clustering, ...

- emulation of newly standardized scripts becomes possible on older
implementations that lack support for the new characters or scripts...
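
To make the idea concrete, here is a rough sketch of how such a tag could
be parsed and used to answer Marco's question ("are U+E017 and U+E009
letters?"). Everything in it is an assumption made for illustration: the
names PuaTag, parse_pua_tag, relative_index, remap and general_category
are invented, the PROPERTIES table holds made-up entries standing in for
a real local properties database, and I read both "b=" and "l=" as
hexadecimal, which the tag syntax above does not actually specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PuaTag:
    script: str   # user-defined script code, e.g. "x-klingon"
    base: int     # first PUA code point used by the convention
    length: int   # number of code points in the convention's codespace


def parse_pua_tag(tag: str) -> PuaTag:
    """Parse a tag such as "x-klingon;b=E000;l=80".

    Assumption: both b= and l= are hexadecimal numbers.
    """
    script, *params = tag.split(";")
    fields = dict(p.split("=", 1) for p in params)
    return PuaTag(script, int(fields["b"], 16), int(fields["l"], 16))


def relative_index(tag: PuaTag, cp: int):
    """Return the offset of cp inside the tagged codespace, or None."""
    offset = cp - tag.base
    return offset if 0 <= offset < tag.length else None


def remap(tag: PuaTag, cp: int, new_base: int) -> int:
    """Move cp to a locally chosen range; leave untagged code points alone."""
    idx = relative_index(tag, cp)
    return cp if idx is None else new_base + idx


# Hypothetical local properties database, indexed by script code and then
# by relative code point. The entries are invented for this example.
PROPERTIES = {
    "x-klingon": {0x09: "Lo", 0x17: "Lo"},
}


def general_category(tag: PuaTag, cp: int) -> str:
    """Look up a General Category, falling back to "Co" (private use)."""
    idx = relative_index(tag, cp)
    if idx is None:
        return "Co"
    return PROPERTIES.get(tag.script, {}).get(idx, "Co")


tag = parse_pua_tag("x-klingon;b=E000;l=80")
for cp in (0xE017, 0xE009):
    print(f"U+{cp:04X}", general_category(tag, cp))
```

Because the properties are indexed by the relative code point rather than
the absolute one, the same table keeps working after remap() has moved
the text to another range to avoid a conflict.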


