Re: short Unicode names?

From: Sairus P. Patel (sppatel@Adobe.COM)
Date: Tue Jan 06 1998 - 18:19:17 EST


Werner:

Here's the URL for Adobe's "Unicode and Glyph Names" document:

 http://www.adobe.com/supportservice/devrelations/typeforum/unicodegn.html

for your reference, even though it won't fit your particular needs, because:

1. It specifies an extensible glyph-naming mechanism ("uni<CODE>" e.g.
   "uniBB75") for all Unicode characters not in the 1000-odd-entry Adobe
   Glyph List. You say you want descriptive, "human-readable" names for all
   entries in your database.

2. Due to backward-compatibility issues, several glyph names are not
   particularly appropriate; further, some are mapped to more than one
   Unicode value. You say you aren't limited by compatibility issues.
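
To illustrate the "uni<CODE>" mechanism in item 1, here is a minimal
Python sketch. It assumes the name is simply "uni" followed by the code
value as four uppercase hexadecimal digits, as in the "uniBB75" example;
the function name uni_glyph_name is hypothetical, and the lookup in the
Adobe Glyph List that would normally come first is left out.

```python
def uni_glyph_name(code_point: int) -> str:
    """Build a 'uni<CODE>' style name for a 16-bit Unicode value.

    Assumption: <CODE> is the value written as four uppercase hex digits.
    A real implementation would consult the Adobe Glyph List first and
    fall back to this form only for characters not listed there.
    """
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("this sketch only covers U+0000..U+FFFF")
    return f"uni{code_point:04X}"


print(uni_glyph_name(0xBB75))  # uniBB75
```

Names built this way are mechanical rather than descriptive, which is
why they don't meet the "human-readable" requirement.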

John Clews' algorithm seems to be the sort of thing you're looking for,
though it might break down with some of Unicode's longer names. Just for
fun, I tried it on U+FBF9, "ARABIC LIGATURE UIGHUR KIRGHIZ YEH WITH HAMZA
ABOVE WITH ALEF MAKSURA ISOLATED FORM", and it produced:

   Ar_liga_uigh_kirg_yeh_hamz-a_alef_isol_form

which is 43 characters long, 11 over your specified limit! (I do believe
John implied that this algorithm has been used only with Latin and Cyrillic
characters.) If the underlines are removed, as the algorithm suggests when
space is at a premium, a 35-character word is produced, still 3 over the
limit:

   Arligauighkirgyehhamz-aalefisolform
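
The character counts above can be checked with a short Python sketch. The
32-character limit is inferred from "43 characters long, 11 over"; the
names LIMIT and compact are hypothetical, and the abbreviation itself is
taken as given rather than recomputed from John's algorithm.

```python
LIMIT = 32  # assumed: 43 characters is described as 11 over the limit


def compact(name: str) -> str:
    """Drop the underlines from an abbreviated name."""
    return name.replace("_", "")


abbreviated = "Ar_liga_uigh_kirg_yeh_hamz-a_alef_isol_form"

for candidate in (abbreviated, compact(abbreviated)):
    over = len(candidate) - LIMIT
    print(f"{candidate}: {len(candidate)} characters, {over} over the limit")
```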

I'm not familiar with TeX's glyph naming mechanism, or how the user
interfaces with it, but are you sure you want to have short descriptive
names for *all* Unicode values?

Sairus


