From: Kenneth Whistler (firstname.lastname@example.org)
Date: Wed Oct 24 2007 - 13:41:13 CDT
Tim Armes asked:
> I'm looking for accurate answers to the following questions. I've
> spent a lot of time trying to find this information but it doesn't
> appear to be readily available.
In part that is because the questions are not completely well-formed,
and even if fixed, the answers are problematical. And the request for
*absolute* answers is probably doomed.
> 1) How many and which languages absolutely require the use of
> combining marks due to the fact that the pre-composed glyphs
> aren't sufficient?
A. This question isn't really about *languages*, but about writing systems
(or orthographies) used to write languages. As an example, take
standard Mandarin Chinese. If written with the Han writing system
(the Chinese ideographic characters), it basically requires no
use of combining marks. If written with the Pinyin Latin orthography,
there are precomposed characters for all the letters. If written
in IPA (also Latin), then you would need lots of combining marks.
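
As a rough illustration of the Pinyin versus IPA contrast, here is a minimal
Python sketch using the standard unicodedata module. The helper name
needs_combining_marks and the two sample strings are my own assumptions, not
anything from the question. The idea is to normalize to NFC and see whether
any combining mark survives.

```python
import unicodedata

def needs_combining_marks(text: str) -> bool:
    """Return True if the NFC form of text still contains a mark character."""
    nfc = unicodedata.normalize("NFC", text)
    # General categories Mn, Mc and Me all start with "M".
    return any(unicodedata.category(ch).startswith("M") for ch in nfc)

# Pinyin sample typed as base letters plus combining marks:
# u + COMBINING DIAERESIS + COMBINING CARON, and e + COMBINING ACUTE ACCENT.
pinyin = "nu\u0308\u030Cre\u0301n"

# IPA-style sample: LATIN SMALL LETTER OPEN E + COMBINING TILDE (nasalized),
# for which Unicode has no precomposed character.
ipa = "\u025B\u0303"

print(needs_combining_marks(pinyin))  # False
print(needs_combining_marks(ipa))     # True
```

The Pinyin string composes fully into precomposed letters (U+01DA and U+00E9),
while the IPA string keeps its combining tilde even after NFC.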
B. The issue isn't precomposed *glyphs*, but precomposed *characters*.
Sequences of base letter plus combining mark(s) may end up
being displayed with precomposed *glyphs* from a font, in any
case. The glyphs themselves are a matter of the font design
and mapping, whereas the characters are a matter of the
character encoding and are what you store in text strings.
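
To make the character side of that distinction concrete, a small Python
sketch (the variable names are only illustrative): the two strings below are
different sequences of characters in storage, even though a font would
normally show the same glyph for both.

```python
import unicodedata

precomposed = "\u00E9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT

# Different stored characters, so a plain comparison fails.
print(precomposed == decomposed)          # False
print(len(precomposed), len(decomposed))  # 1 2

# Normalization maps one representation onto the other.
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
print(unicodedata.normalize("NFD", precomposed) == decomposed)  # True
```

Which glyph actually gets drawn for either sequence is up to the font and
rendering engine, and is not visible at this level at all.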
C. Asking which ones "absolutely require" combining marks will get
you unclear answers, because you can generally find edge
cases of usage that would result in someone using a combining
mark. What you are actually after is the answer to a "typically
require" question instead. And for that, you can give
a general answer: All of the non-Latin writing systems of
South and Southeast Asia typically require the use of
combining marks. Arabic (script -- which is used to write
many distinct languages) also typically requires the use of
combining marks.
Hebrew typically doesn't require combining marks -- but
it *absolutely* does, because pointed Hebrew isn't that
uncommon for some types of materials, is part of the
writing system, and requires combining marks.
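
The Hebrew case can be sketched in Python as follows. The sample word and the
helper name count_marks are my own choices for illustration. The same word is
stored once unpointed and once pointed, and the points show up as
nonspacing marks that NFC does not fold into the base letters.

```python
import unicodedata

def count_marks(text: str) -> int:
    """Count nonspacing marks (category Mn) remaining after NFC."""
    nfc = unicodedata.normalize("NFC", text)
    return sum(1 for ch in nfc if unicodedata.category(ch) == "Mn")

# "shalom" without points: shin, lamed, vav, final mem.
unpointed = "\u05E9\u05DC\u05D5\u05DD"

# The same word pointed: shin + QAMATS + SHIN DOT, lamed, vav + HOLAM, final mem.
pointed = "\u05E9\u05B8\u05C1\u05DC\u05D5\u05B9\u05DD"

print(count_marks(unpointed))  # 0
print(count_marks(pointed))    # 3
```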
For the Greek script you can generally get by without
combining marks, but the preferred representation of
polytonic Greek is with combining marks.
For the Latin script, the answers are very difficult to
come by. Most major European languages can be written
without combining marks, but there are thousands of
Latin-based orthographies in use around the world, and
many of those -- even some for very large, official
languages in Africa, for instance -- require some use
of combining marks.
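
As one example of my own choosing (not one named above), Yoruba writes tone
marks over letters that already carry a dot below. A minimal Python sketch
shows that such a letter cannot be reduced to a single precomposed character:

```python
import unicodedata

# e + COMBINING DOT BELOW + COMBINING ACUTE ACCENT, as in Yoruba orthography.
typed = "e\u0323\u0301"
nfc = unicodedata.normalize("NFC", typed)

# NFC composes e + dot below into U+1EB9, but the acute has nothing to
# compose with, so it stays as a separate combining mark.
print([f"U+{ord(ch):04X}" for ch in nfc])  # ['U+1EB9', 'U+0301']

# By contrast, plain e + acute composes to a single character.
print(len(unicodedata.normalize("NFC", "e\u0301")))  # 1
```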
> 2) How many and which languages absolutely require the
> use of variant selectors?
At the moment, variation selection sequences are only
defined for Mongolian and for Phags-pa scripts. (Both of
those scripts are used to write several languages -- see
the Unicode Standard or other references for details.)
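
For what a variation selection sequence looks like in stored text, here is a
small Python sketch. The particular base letter is only an illustration.
Whether a given base plus selector pair is actually a standardized sequence
has to be checked against the Unicode data files, which this code does not do.

```python
import unicodedata

base = "\u1820"      # MONGOLIAN LETTER A
selector = "\u180B"  # MONGOLIAN FREE VARIATION SELECTOR ONE
sequence = base + selector

# The selector is an ordinary code point stored right after the base letter.
print(len(sequence))  # 2
for ch in sequence:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Normalization leaves the sequence alone; the selector is not folded away.
print(unicodedata.normalize("NFC", sequence) == sequence)  # True
```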
A large set of variation selection sequences are in
the process of standardization for CJK ideographic
characters -- but the intent there is not to *require*
their use, unless you explicitly want a very exact
choice of glyph at some point in the text.
Andrew West clarified the situation for Mongolian and Phags-pa.
> 3) How many and which languages absolutely require the
> use of variant glyphs?
Not answerable without further clarification of what kind
of requirement you have in mind.
Note, for example, that Latvian in some sense "requires"
an alternative glyph for U+0123 LATIN SMALL LETTER G WITH
CEDILLA for good typography, but you can get by without
it and still have legible text.