Greek Prosgegrammeni

From: Lukas Pietsch
Date: Tue Nov 07 2000 - 07:47:15 EST


As a newcomer to this list, let me raise a question that was probably
discussed a long time ago but still seems not quite settled: the
encoding of the "iota prosgegrammeni" characters in Greek
Extended. Apparently a lot of confusion has followed from an initial
misunderstanding that an "iota prosgegrammeni" ("adscript iota") is a
diacritic that looks similar or identical to an "iota ypogegrammeni"
("subscript iota").

It is not. Correct me if I'm wrong, but the only "adscript iota" I know of
in traditional Greek orthography is simply a normal, full-sized iota glyph
(lower-case if the word is title-case or lower-case; upper-case if the word
is upper-case). The only difference between such an "adscript iota" and a
normal iota seems to be that the adscript is ignored in collation. The
adscript iota obligatorily replaces a "subscript iota" in titlecase or
uppercase, whereas it can also be used as an optional, slightly archaic
variant instead of the subscript in lowercase.

If anybody has evidence that small, diacritic-like iota glyphs were ever
used with capital base letters in Greek writing, please let me know and
ignore the rest of this message.

The misunderstanding of the diacritic-like adscript iota is unfortunately
still spreading through the world because the Unicode code charts
show it this way. Most font designers have followed what the charts seemed
to dictate to them (even when they knew better), with the result that now
there are very few fonts that show these characters correctly. Microsoft's
"Palatino Linotype" is an exception, as is James Kass's "Code2000" in its
most recent update.

But the real question I'd like to raise is that of the character properties
defined for these characters in Unicode.

The current version correctly states that the standalone "GREEK
PROSGEGRAMMENI" (u+1FBE) is canonically equivalent to "GREEK SMALL
LETTER IOTA" (u+03B9), the ordinary lower-case iota.
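
For anyone who wants to check this equivalence, here is a minimal sketch in
Python using the standard "unicodedata" module. It assumes a present-day
Python, whose Unicode tables are newer than the version under discussion
here; the helper name canonical_decomposition is only illustrative.

```python
import unicodedata

PROSGEGRAMMENI = "\u1fbe"  # GREEK PROSGEGRAMMENI
SMALL_IOTA = "\u03b9"      # GREEK SMALL LETTER IOTA


def canonical_decomposition(ch: str) -> str:
    """Return the raw decomposition field for one character.

    A compatibility mapping would start with a <tag>; a canonical
    mapping is just a list of hexadecimal code points.
    """
    return unicodedata.decomposition(ch)


# Raw decomposition field of the standalone prosgegrammeni.
decomp = canonical_decomposition(PROSGEGRAMMENI)
print(unicodedata.name(PROSGEGRAMMENI), "->", decomp)

# Canonical decomposition (NFD) applies that mapping to the text itself.
nfd = unicodedata.normalize("NFD", PROSGEGRAMMENI)
print("NFD gives small iota:", nfd == SMALL_IOTA)
```

Because the decomposition field carries no <tag>, the mapping is canonical
rather than a compatibility mapping, so any normalization form replaces
u+1FBE with u+03B9.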

However, the precomposed characters containing the prosgegrammeni, e.g.
u+1FBC, decompose to base letter + "COMBINING GREEK YPOGEGRAMMENI" (u+0345),
as if prosgegrammeni and ypogegrammeni were the same thing. This means that,
even if I have a font that shows u+1FBC correctly, the incorrect subscript
glyphs will reappear as soon as my text undergoes canonical decomposition.

Is there a logic to this that I don't understand, or is this just a hangover
from the time when people did think prosgegrammeni and ypogegrammeni were
the same thing? Wouldn't it be much more logical if precomposed capital
letters such as u+1FBC decomposed to base letter + "GREEK PROSGEGRAMMENI"
(u+1FBE), or directly to base letter + "GREEK SMALL LETTER IOTA" (u+03B9)? To
ensure correct case conversion, it would then probably be necessary to
introduce another special casing rule, making sure that "COMBINING GREEK
YPOGEGRAMMENI" (u+0345) gets mapped to "GREEK PROSGEGRAMMENI" (u+1FBE) when
its preceding base letter gets title-cased.
Something like this is already being done for upper-case anyway.
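
The decomposition and casing behaviour described above can be inspected with
a short Python sketch. It again assumes a present-day Python with its bundled
Unicode tables (newer than the version discussed here) and that str.upper()
and str.title() apply the full case mappings; the helper name codepoints is
only illustrative.

```python
import unicodedata


def codepoints(s: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


CAPITAL_ALPHA_PROSGEGRAMMENI = "\u1fbc"  # precomposed capital alpha + adscript
SMALL_ALPHA_YPOGEGRAMMENI = "\u1fb3"     # precomposed small alpha + subscript

# Canonical decomposition of the precomposed capital letter.
nfd_capital = unicodedata.normalize("NFD", CAPITAL_ALPHA_PROSGEGRAMMENI)
print("NFD(u+1FBC):  ", codepoints(nfd_capital))

# Upper-casing and title-casing the lower-case precomposed letter.
upper = SMALL_ALPHA_YPOGEGRAMMENI.upper()
title = SMALL_ALPHA_YPOGEGRAMMENI.title()
print("upper(u+1FB3):", codepoints(upper))
print("title(u+1FB3):", codepoints(title))

# Upper-casing the combining ypogegrammeni on its own.
upper_ypogegrammeni = "\u0345".upper()
print("upper(u+0345):", codepoints(upper_ypogegrammeni))
```

If the title-cased result is subsequently decomposed, the combining u+0345
comes back, which is exactly the round trip that worries me.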

Does this make sense, and if yes, is there any way of getting it into the

Lukas Pietsch
Ferdinand-Kopf-Str. 11
D-79117 Freiburg
Tel. 0761-696 37 23

Universität Freiburg
Englisches Seminar
