Re: Digraphs as Distinct Logical Units

Date: Sat Aug 03 2002 - 01:38:03 EDT

On 08/02/2002 03:17:56 PM "Sean B. Palmer" wrote:

>If anyone has any comments on this, or any references to previous
>discussions, they would be gladly received.

Any discussion of encoding Latin digraphs as units makes an unvalidated
assumption that there is some benefit to be gained. We've gone for several
decades of English text processing never having encoded English digraphs
(th, ch, ph, wh, ff, gh, tt, ck, ou, ei, ie, ea, ee, oo, oa, etc. and
arguably a...e, e...e, i...e, o...e, u...e as well) as single characters,
and never having felt a need. We have decades of experience dealing with
implementations of Latin script, and less with implementations of Indic
scripts. Yet in scripts with which we have less experience, such as Thai,
we encode some complex multigraphs (especially those representing vowels)
as sequences of multiple characters, and nobody claims there is a problem
that can only be solved by encoding the digraphs as units. Why is it,
then, that for the script with which we have rather more experience,
people feel that encoding digraphs is necessary?

(Those are my thoughts, at any rate.)

- Peter

Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485

