Re: ZWJ and Latin Ligatures

From: John Hudson
Date: Thu Jul 04 2002 - 13:19:44 EDT

At 18:49 7/2/2002, Michael Everson wrote:

>>Alas, but that's technically impossible. Both OT and AAT (I'm not sure
>>about Graphite) require that single characters map to single glyphs,
>>which are then processed.
>Hm? How do you handle the decomposed sequence A + COMBINING ACUTE? Surely
>that is a sequence of characters mapping to a single glyph.

Nope. That's two characters, mapped to two glyphs, that might be handled by

a) character level mapping of two characters to a single character
represented by a single glyph;
b) character level mapping of two characters to a single character
represented in glyph level processing by two glyphs;
c) glyph level mapping of two glyphs to a single glyph;
d) glyph level positioning of two glyphs to form a single typeform (grapheme).

There may be other variations.
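
To make options (a) and (c) concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any real OpenType or AAT engine is implemented: the glyph names and both tables (GLYPH_FOR_CHAR, GLYPH_LIGATURES) are hypothetical toy data, and only unicodedata.normalize is a real library call.

```python
import unicodedata

# Hypothetical toy font data; the glyph names are illustrative only.
GLYPH_FOR_CHAR = {"A": "A", "\u0301": "acutecomb", "\u00C1": "Aacute"}
# Stands in for a glyph-level ligature substitution table.
GLYPH_LIGATURES = {("A", "acutecomb"): "Aacute"}


def map_chars(text):
    """One character in, one glyph out."""
    return [GLYPH_FOR_CHAR[ch] for ch in text]


def option_a(text):
    """(a) Compose in character space (NFC), then map one-to-one."""
    return map_chars(unicodedata.normalize("NFC", text))


def option_c(text):
    """(c) Map one-to-one first, then replace glyph pairs with one glyph."""
    glyphs = map_chars(text)
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in GLYPH_LIGATURES:
            out.append(GLYPH_LIGATURES[pair])
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


decomposed = "A\u0301"  # A + COMBINING ACUTE ACCENT
```

Both option_a(decomposed) and option_c(decomposed) end with the single toy glyph "Aacute", but the merge happens at different levels: on characters in (a), on glyphs in (c). In neither case does the character-to-glyph step itself see a sequence.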

The cmap table maps individual characters to individual glyphs; a single
glyph may be mapped from one or more characters. It cannot map sequences of
characters to glyphs, or sequences of glyphs to characters.
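
A small Python sketch of that constraint, assuming a hypothetical toy table (TOY_CMAP and its glyph names are invented for illustration, not taken from any real font): every key is a single code point, so a glyph can be reached from several characters, but a sequence is never a key.

```python
# Hypothetical toy cmap: keys are single code points, values are glyph names.
TOY_CMAP = {
    0x0041: "A",          # LATIN CAPITAL LETTER A
    0x0391: "A",          # GREEK CAPITAL LETTER ALPHA, sharing the Latin glyph
    0x0301: "acutecomb",  # COMBINING ACUTE ACCENT
}


def chars_for_glyph(cmap):
    """Invert the table: each glyph lists the characters mapped to it."""
    inverse = {}
    for code_point, glyph in cmap.items():
        inverse.setdefault(glyph, []).append(code_point)
    return {glyph: sorted(cps) for glyph, cps in inverse.items()}


def lookup(cmap, text):
    """Look up each character on its own; there is no key for a sequence."""
    return [cmap[ord(ch)] for ch in text]
```

Here chars_for_glyph(TOY_CMAP) shows the glyph "A" reachable from two characters, while lookup(TOY_CMAP, "A\u0301") can only return two glyphs, one per character. Anything that merges them has to happen before or after this step.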

>>(In OT, of course, you are also supposed to do some preprocessing in
>>character space, but that doesn't solve this problem.) It would be nice
>>to have a cmap format which maps multiple characters to single glyphs
>I always thought there was. Now I'm really confused as to how I would make
>a complex Indic syllable.

especially the section on Uniscribe in the 'Overview' part, which includes
a step-by-step analysis of the shaping of a Sanskrit word.

The AAT approach is, of course, a bit different, because the
character-level re-ordering takes place at the glyph level along with
everything else.

John Hudson

Tiro Typeworks
Vancouver, BC

Language must belong to the Other -- to my linguistic community
as a whole -- before it can belong to me, so that the self comes to its
unique articulation in a medium which is always at some level
indifferent to it. - Terry Eagleton
