Date: Mon Oct 20 2003 - 19:41:48 CST

Marco Cimarosti wrote,

> So far so good. Now I want to use your PUA Plane-14 tags, if present, to
> override the above assumption about PUA characters. E.g., imagine that my
> string contains this:
> 󠀀󠀂󠁆󠁯󠁏󠁢󠁡󠁲󠀮󠁴󠁴󠁦󠁿> ?
> (U+0E0000 U+0E0002 U+0E0046 U+0E006F U+0E006F U+0E0062 U+0E0061
> U+0E0072 U+0E002E U+0E0074 U+0E0074 U+0E0066 U+0E007F U+E017 U+E009)
> This is what I am going to do:
> 1) I parse the tags at the beginning of the string and save the relevant
> information in a temporary variable which we will call PuaInterpretation;
> 2) I remove the tags.
> Now, my PuaInterpretation variable contains the following information:
> Foobar.ttf
> And my string contains the following text:
> 
> (U+E017 U+E009)
> Now, what's the next step? What am I supposed to do to find out whether,
> according to the PUA interpretation called "Foobar.ttf", U+E017 and U+E009
> are letters or not?
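
The two steps quoted above (parse the leading tags, then remove them) could be sketched roughly as follows. This is only an illustration, not part of any specification: the function name split_pua_tags is made up, and it assumes that each Plane-14 tag character maps to ASCII by subtracting U+E0000 and that values outside printable ASCII (such as U+E0000, U+E0002 and U+E007F in the example) are markers rather than part of the name.

```python
TAG_BASE = 0xE0000  # start of the Plane-14 tag block
TAG_END = 0xE007F   # last code point of the tag block


def split_pua_tags(text):
    """Return (interpretation, remaining_text) for a string with leading tags.

    Hypothetical helper: the leading run of Plane-14 tag characters is
    decoded to ASCII and removed; the rest of the string is left untouched.
    """
    payload = []
    i = 0
    while i < len(text) and TAG_BASE <= ord(text[i]) <= TAG_END:
        ascii_value = ord(text[i]) - TAG_BASE
        # Assumption: only printable ASCII belongs to the name; everything
        # else in the tag block is treated as a marker and skipped.
        if 0x20 <= ascii_value <= 0x7E:
            payload.append(chr(ascii_value))
        i += 1
    return "".join(payload), text[i:]


# Build a tagged string like the one in the example above.
tagged = (
    "\U000E0000\U000E0002"
    + "".join(chr(TAG_BASE + ord(c)) for c in "Foobar.ttf")
    + "\U000E007F"
    + "\uE017\uE009"
)
pua_interpretation, remaining = split_pua_tags(tagged)
```

After this runs, pua_interpretation holds the decoded name and remaining holds only the two PUA characters.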

Hmmm, the UTF-8 non-BMP string apparently got munged.

Anyway, the next step is for your function to load the file

This file is a plain-text file following the same format as UNIDATA. It's
extensible -- if the font vendor doesn't include it with the font download,
then the savvy end-user can simply construct it with a plain-text editor.

Now your function has all the necessary information and can determine
whether the PUA code points are letters, or not.
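
As a rough sketch of that lookup, assuming the file uses the semicolon-separated layout of UnicodeData.txt (code point in the first field, name in the second, General Category in the third): the sample lines, character names and function names below are invented for illustration and do not come from any real font.

```python
# Invented sample data in UnicodeData.txt layout; a real file would be
# read from disk instead.
SAMPLE_PUA_DATA = """\
E009;FOOBAR EXAMPLE PUNCTUATION;Po;0;ON;;;;;N;;;;;
E017;FOOBAR EXAMPLE LETTER;Lo;0;L;;;;;N;;;;;
"""


def load_categories(unidata_text):
    """Map each code point to its General Category (third field)."""
    categories = {}
    for line in unidata_text.splitlines():
        line = line.strip()
        # Skip blank lines; comment lines are tolerated as a convenience.
        if not line or line.startswith("#"):
            continue
        fields = line.split(";")
        if len(fields) < 3:
            continue  # not a usable record
        categories[int(fields[0], 16)] = fields[2]
    return categories


def is_letter(ch, categories):
    """True if the character's category is one of the letter classes (L*)."""
    # Characters missing from the file keep the default private-use
    # category "Co", which is not a letter.
    return categories.get(ord(ch), "Co").startswith("L")


categories = load_categories(SAMPLE_PUA_DATA)
results = {hex(ord(ch)): is_letter(ch, categories) for ch in "\uE017\uE009"}
```

With this sample data, U+E017 is reported as a letter and U+E009 is not; a code point that the file does not list falls back to the ordinary private-use default.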

Best regards,

James Kass
