Re: Some Char. to Glyph Statistics, Pan/Single Font

From: Jungshik Shin (jshin@mailaps.org)
Date: Wed May 30 2001 - 15:46:04 EDT


On Wed, 30 May 2001, James E. Agenbroad wrote:

  Thank you for the interesting piece of information.

> Wednesday, May 30, 2001
> Attached is a note I wrote in September 1993 about the ratio of characters
> to glyphs in several Indic scripts. Much has changed on the Unicode
> front since then, but I think the need for rendering software to decide

> character to glyph ratio; Chinese, Japanese and (maybe Korean) scripts
> also tend to have a 1:1 character to glyph ratio. But most scripts

  In the case of Korean Hangul, your 'maybe' is justified because
the situation is not so simple. If you only consider the pre-composed
syllable block beginning at U+AC00 and have fonts with pre-composed glyphs
for all of those syllables, it could be 1:1. However, if you turn your eyes
to the Hangul consonant/vowel (Jamo) block at U+1100 and want full-fledged
support for medieval Korean, the ratio can be anybody's guess, from 1:1
(poor quality, unconventional shapes) to 1:n to m:n (where n can be
a few tens, if not more). In the 1980s, typical MS-DOS-based programs (or
Hangul rendering libraries/engines) used something like 1:8, 1:4, 1:4 for
initial consonants, medial vowels, and final consonants, respectively. A
Korean variant of xterm (a terminal emulator for the X11 window system) has
been using fonts with a 1:10, 1:3, 1:4 ratio. Some high-quality TrueType
fonts for Hangul these days (internally) have 1:n (n ~ 30), I believe.
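
  To make the arithmetic concrete, here is a rough Python sketch. The
decomposition constants come from the Unicode Standard's Hangul syllable
arithmetic; the function names (decompose, glyph_count) are made up for
illustration, and the glyph totals assume only the modern jamo (19 initial
consonants, 21 medial vowels, 27 final consonants). Real fonts of that era
may have counted things differently (filler glyphs, for example), so treat
the totals as ballpark figures only.

```python
# Constants from the Unicode Standard's Hangul syllable arithmetic.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28   # T_COUNT includes "no final consonant"
S_COUNT = L_COUNT * V_COUNT * T_COUNT    # number of pre-composed syllables


def decompose(syllable):
    """Split one pre-composed syllable (U+AC00 block) into U+1100-block jamo."""
    s_index = ord(syllable) - S_BASE
    if not 0 <= s_index < S_COUNT:
        raise ValueError("not a pre-composed Hangul syllable")
    l_index = s_index // (V_COUNT * T_COUNT)
    v_index = (s_index % (V_COUNT * T_COUNT)) // T_COUNT
    t_index = s_index % T_COUNT
    jamo = [chr(L_BASE + l_index), chr(V_BASE + v_index)]
    if t_index:                          # index 0 means there is no final consonant
        jamo.append(chr(T_BASE + t_index))
    return jamo


def glyph_count(initial_variants, medial_variants, final_variants):
    """Glyphs needed when each jamo has a fixed number of positional variants.

    Assumption: modern jamo only (19 initials, 21 medials, 27 finals).
    """
    return (L_COUNT * initial_variants
            + V_COUNT * medial_variants
            + (T_COUNT - 1) * final_variants)


if __name__ == "__main__":
    print([hex(ord(j)) for j in decompose("\ud55c")])   # jamo of U+D55C
    print("pre-composed syllables:", S_COUNT)
    print("1:8, 1:4, 1:4 scheme  :", glyph_count(8, 4, 4))
    print("1:10, 1:3, 1:4 scheme :", glyph_count(10, 3, 4))
```

  Under those assumptions, a few hundred variant glyphs stand in for the
whole pre-composed block, which is why the per-jamo ratio climbs well
above 1:1 even for modern Korean.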

> ---------- Forwarded message ----------
> Date: Fri, 10 Sep 93 14:12:07 -0400
> From: jage (James E. Agenbroad)
> Subject: Some Character to Glyph Statistics

> Recent Internet discussions about fonts for ISO10646/Unicode prompted
> me to do some counting. The data are suggestive rather than definitive
> at least in part because the counts of glyphs are based on only a single
> source and it may not be up to date. They do suggest that for various
> writing systems of South (and maybe Southeast) Asia based on Indic scripts
> the ratio of coded characters to glyphs is not 1:1 but 1:2 or even 1:3.

  I thought (without any basis or hard data; that is, it was just my wild
guess) that the ratio would be much higher than 1:3 for Indic scripts.
With the ratio being only 1:3 or so, I guess Indic scripts are in much
better shape to be supported than medieval (and some elements of
modern) Korean. Projects like Pango (http://www.pango.org) have already
begun to support Indic and Thai scripts, not to mention other commercial and
non-commercial implementations (Uniscribe, AAT, Graphite, ...). So the
eight years since your original message haven't been wasted, I think :-)

  Jungshik Shin


