Date: Mon May 03 2004 - 07:20:54 CDT

Philippe Verdy wrote,

> From: "D. Starner"
> > Unicode will not allocate any more codes for characters that can be made
> > precomposed, as it would disrupt normalization.
> But what about characters that may theoretically be composed with combining
> sequences, but almost always fail to be represented successfully?


> If such a ligature has semantics distinct from a ligature created by joining
> separate letters for presentation purposes, the character is not a ligature
> (the AE and OE "ligated glyphs" are distinct abstract characters).

The "gb" combination mentioned in the original post is considered a letter
in the Yoruba alphabet. It is not a ligature; it is a digraph. Likewise,
in the Spanish alphabet, the "ll" combination is considered a letter. It
is also a digraph. Both of these combinations can already be represented
as sequences of two ASCII letters.
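
To illustrate the point about digraphs, here is a minimal Python sketch. The
helper name is_ascii_sequence is hypothetical and exists only for this
example; it simply checks that every code point in a string falls in the
ASCII range.

```python
def is_ascii_sequence(text):
    """Return True if every character in text is an ASCII code point."""
    return all(ord(ch) < 128 for ch in text)

# The Yoruba and Spanish digraphs discussed above.
digraphs = ["gb", "ll"]

# Each digraph is stored as two separate code points, not one.
lengths = {d: len(d) for d in digraphs}

for d in digraphs:
    print(d, is_ascii_sequence(d), [f"U+{ord(ch):04X}" for ch in d])
```

Treating the pair as a single letter (for sorting, say) is then a matter for
software working on the text, not for the encoding.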

(Note that the AE and OE "ligated glyphs" *are* ligatures.)

> The case of dot below however should be handled in fonts by proper glyph
> positioning and probably not by newly assigned code points, unless this is
> only one possible presentation form for an actual distinct abstract character
> that may have other forms without this separate diacritic (for example if g
> with dot below was only one presentation for an abstract character that may
> also be rendered with a small gamma)...

Yoruba doesn't use any marks with the letter "g". It does use some diacritics
like acute, grave, and macron to indicate tones. It also uses a mark below
the letters "e", "o", and "s" which alters the pronunciation of those letters.
This is where some controversy remains. One faction prefers the use
of a vertical line below, which should attach to the base letter, and the
other faction prefers to use the dot below.
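
As a rough sketch of how the two conventions look at the code point level,
the Python below uses the standard unicodedata module. It assumes the dot
below is encoded as U+0323 COMBINING DOT BELOW and the vertical line as
U+0329 COMBINING VERTICAL LINE BELOW; the helper names nfc and code_points
are hypothetical and used only here.

```python
import unicodedata

DOT_BELOW = "\u0323"            # COMBINING DOT BELOW
VERTICAL_LINE_BELOW = "\u0329"  # COMBINING VERTICAL LINE BELOW (assumed choice)
ACUTE = "\u0301"                # COMBINING ACUTE ACCENT, used as a tone mark

def nfc(text):
    """Apply Unicode Normalization Form C (canonical composition)."""
    return unicodedata.normalize("NFC", text)

def code_points(text):
    """List the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

# "e" + dot below has a precomposed form, so NFC yields one code point.
e_dot = nfc("e" + DOT_BELOW)

# "e" + vertical line below has no precomposed form and stays a sequence.
e_line = nfc("e" + VERTICAL_LINE_BELOW)

# A tone mark typed before or after the mark below normalizes the same way.
tone_first = nfc("e" + ACUTE + DOT_BELOW)
tone_last = nfc("e" + DOT_BELOW + ACUTE)

print(code_points(e_dot), unicodedata.name(e_dot))
print(code_points(e_line))
print(code_points(tone_first), tone_first == tone_last)
```

Either way, the letter with both a mark below and a tone mark ends up as a
combining sequence, so good display still depends on the font positioning
the marks properly.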

Best regards,

James Kass

