At 06:17 AM 5/3/2004, wrote:
>Unicode considers such combinations of letters to be "presentation forms"
>of letters which are already covered in the Unicode Standard. Although
>for the Yoruba language, the "gb" digraph is treated as a single letter,
>for computer encoding it is a string of two characters, "g" plus "b".

This is only true if:

a) there is no visual differentiation, or

b) there is visual differentiation, but it is not discretionary.

In other words, if the digraph looks the same as the two letters next to
each other individually, we tend not to encode it separately. If the
digraph is visually distinct, then the fun starts.

In theory, if a language uses *only* that digraph, you could treat it like
a ligature and build a language-dependent font. However, for Latin-based
languages, it is always necessary to be able to write other common
languages with equal facility, so a mandatory ligature seems a poor
option. That is, after all, why we have the Danish AE and the French OE
in the standard.
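
To make the distinction concrete, here is a minimal Python sketch. It is only
an illustration of mine: the helper name code_points is made up for the
example, and it uses nothing beyond the standard unicodedata module. It
contrasts the "gb" digraph, which is stored as two ordinary letters, with AE
and OE, which each have a code point of their own.

```python
import unicodedata

# The Yoruba digraph is stored as two ordinary Latin letters.
digraph = "gb"
# AE and OE each have their own code point in the standard.
ae = "\u00e6"   # LATIN SMALL LETTER AE
oe = "\u0153"   # LATIN SMALL LIGATURE OE


def code_points(text):
    """Return 'U+XXXX NAME' for each code point in text (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for sample in (digraph, ae, oe):
    print(repr(sample), len(sample), code_points(sample))
```

The digraph reports a length of 2, while AE and OE each report a length of 1.
Whether "gb" should be drawn as a single shape is then a question for the
font, not for the encoding.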

I would like to see a (small) picture of Yoruba text with these digraphs.


