Re: lists of actual character/diacritic combinations

From: John Cowan (jcowan@reutershealth.com)
Date: Wed Mar 01 2000 - 12:42:26 EST


Kenneth Whistler wrote:

> The raw figures are posted below.

Thanks.

> These constitute the lumped sums from both the MUMS Books database and
> the JACKPHY database, containing 12,421,528 instances of characters with
> diacritics, out of a total of 1,492,948,727 Latin characters.

BTW, the JACKPHY database (IIRC) is bibliographic information (in Latin
alphabet transliteration) for books written in non-Latin scripts.
So it represents "non-native" uses of diacritics.

An interesting point about ANSEL is that it treats u-horn and o-horn
as distinct letters like eth and ae, rather than as u and o with a
COMBINING HORN as Unicode does. Since HORN is not applied to any
other letters, I wonder why it was analyzed out by the Unicode
designers (it saved only 3 codepoints).
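
For anyone who wants to check the horn analysis against the current
Unicode data, here is a minimal sketch using Python's standard
unicodedata module. It assumes the four horn letters are U+01A0, U+01A1,
U+01AF and U+01B0 (upper- and lowercase o-horn and u-horn); Unicode
encodes these as precomposed letters too, each canonically equivalent to
a base letter plus U+031B COMBINING HORN. The names HORN_LETTERS and
decompose are illustrative only, not taken from ANSEL or the Unicode
standard.

```python
import unicodedata

# Assumed inventory: the precomposed horn letters (o-horn and u-horn,
# upper- and lowercase).
HORN_LETTERS = ["\u01A0", "\u01A1", "\u01AF", "\u01B0"]

COMBINING_HORN = "\u031B"


def decompose(ch: str) -> str:
    """Return the canonical decomposition (NFD) of a single character."""
    return unicodedata.normalize("NFD", ch)


def describe(ch: str) -> str:
    """Format a character and its decomposition as code points."""
    parts = " + ".join(f"U+{ord(p):04X}" for p in decompose(ch))
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {parts}"


if __name__ == "__main__":
    for letter in HORN_LETTERS:
        print(describe(letter))
    # Four separate letters versus one combining mark: the difference
    # is the "3 codepoints" mentioned above.
    print("code points saved:", len(HORN_LETTERS) - 1)
```

Normalizing back with NFC recombines the pair into the single letter, so
the two representations can be converted in either direction.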

-- 

Schlingt dreifach einen Kreis vom dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,           || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)


