Re: Greek accentuation marks

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Oct 28 1998 - 13:13:21 EST

Next message: Julia Oesterle (Unicode): "RE: ask for help"
Previous message: Constantine Stathopoulos: "Greek accentuation marks"
Maybe in reply to: Constantine Stathopoulos: "Greek accentuation marks"

Constantine Stathopoulos commented:
>
> Also in ISO 10646, it is not character 03B9 that is used in the
> decomposition of example 3, but character 1FBE, which has been
> encoded as a separate diacritic. It should also be noted that
> despite what the Unicode charts show, ISO 10646 does not contain
> any character like the one depicted in example 2, but 1FA8 is the
> combination of 03A9,0313 and 1FBE.
>

ISO/IEC 10646-1 does not specify a decomposition for *any*
characters. The concept of decompositions and of canonical
equivalences is introduced by the Unicode Standard -- not by 10646.

With respect to the example being discussed, 10646 has two characters:

U+1FA0 GREEK SMALL LETTER OMEGA WITH PSILI AND YPOGEGRAMMENI
U+1FA8 GREEK CAPITAL LETTER OMEGA WITH PSILI AND PROSGEGRAMMENI

The names themselves do not specify decompositions.

10646 also does not specify case mappings, although we can surmise
from the names that U+1FA0 is intended for lowercase and U+1FA8
for uppercase. Clearly, however, based on the layout of the table
provided by ELOT for the 1FXX block, the intent is that the
polytonic combinations come in case pairs, and U+1FA0/U+1FA8 is
intended to be a lowercase/uppercase pair.

The *Unicode* decomposition of U+1FA0 is:

U+1FA0 = 03C9 + 0345 + 0313 (in canonical order)

i.e. small omega + iota subscript + raised comma above (= psili)

The *Unicode* decomposition of U+1FA8 is:

U+1FA8 = 03A9 + 0345 + 0313 (in canonical order)

i.e. capital omega + iota subscript + raised comma above (= psili)
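As an illustration only, the decompositions above can be inspected with Python's standard unicodedata module. This is a sketch, not part of either standard: the helper name describe_decomposition is hypothetical, and the result reflects whatever version of the Unicode Character Database the Python build ships with. The order of the two combining marks depends on the combining classes in that data, so it may not match the order written above (current data assigns U+0313 class 230 and U+0345 class 240, which places U+0313 first).

```python
import unicodedata

def describe_decomposition(ch):
    """Return (code point, name, combining class) for each character
    in the full canonical decomposition (NFD) of ch."""
    decomposed = unicodedata.normalize("NFD", ch)
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c), unicodedata.combining(c))
        for c in decomposed
    ]

# Print the decomposition of the two characters under discussion.
for ch in ("\u1FA0", "\u1FA8"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
    for code_point, name, ccc in describe_decomposition(ch):
        print(f"    {code_point} {name} (combining class {ccc})")
```

The combining class is printed alongside each mark because it is what determines the canonical order of the marks after the base letter.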

The reason for this is that decompositions involving diacritics on
a baseform letter should preserve the casing distinction in the
baseform letter; i.e. if I perform a case transform on the baseform
letter of the decomposed sequence, I should get the same result as
if I perform the same case transform on the precomposed equivalent
of that sequence.
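A minimal sketch of that property, assuming Python's unicodedata module and built-in str.lower(); the function name lowercase_via_decomposition is made up for the illustration. It lowercases only the baseform letter of the decomposed sequence, recomposes, and compares the result with lowercasing the precomposed character directly. The sketch deliberately sticks to the lowercase direction, because full uppercase mappings for these letters can expand to more than one character in some implementations.

```python
import unicodedata

def lowercase_via_decomposition(ch):
    """Decompose ch, lowercase only the baseform letter, then recompose."""
    decomposed = unicodedata.normalize("NFD", ch)
    base, marks = decomposed[0], decomposed[1:]
    return unicodedata.normalize("NFC", base.lower() + marks)

precomposed_upper = "\u1FA8"  # GREEK CAPITAL LETTER OMEGA WITH PSILI AND PROSGEGRAMMENI

# Both routes should arrive at the same lowercase character.
via_sequence = lowercase_via_decomposition(precomposed_upper)
via_precomposed = precomposed_upper.lower()
print(via_sequence == via_precomposed)
```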

The character sequence involved in the decomposition is a completely
separate issue from what the resulting *glyph* appearance should be.
On this, I concur completely with you that the *glyph* for U+1FA8
should accord with "Greek grammar" and should show the psili to the
left of the capital omega (not over it), and should show the iota
as a prosgegrammeni, small and to the right foot of the capital
omega, and not underneath it. [I am working with others on the editorial
committee to ensure that the next edition of the Unicode Standard
shows U+1FA8 and other similar Greek letters with the prosgegrammeni
glyph forms, matching those shown in 10646 instead of those shown in
Unicode 2.0.]

> but 1FA8 is the
> combination of 03A9,0313 and 1FBE.

Actually, the intent of ELOT would be that 1FA8 consists of:

1FBF + 03A9 + 1FBE

not

03A9 + 0313 + 1FBE

since the "psili" for the uppercase form is represented as a spacing
form to the left of the Omega.

This is not, however, how Unicode decompositions work. Keyboard
input drivers could choose to work that way, in terms of the spacing
forms of the diacritics for polytonic Greek, but in a Unicode implementation,
the preferred form of textual representation would either be as the single
precomposed character U+1FA8 or as the Unicode decomposition I specified
above.
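To make the distinction concrete, here is a hedged sketch using Python's unicodedata module. The constant names and the helper canonically_equivalent are assumptions for the illustration. It tests canonical equivalence by comparing NFD forms, using the data of whatever Unicode version the Python build carries.

```python
import unicodedata

PRECOMPOSED = "\u1FA8"                   # single precomposed character
COMBINING_SEQUENCE = "\u03A9\u0345\u0313"  # capital omega + combining marks
SPACING_SEQUENCE = "\u1FBF\u03A9\u1FBE"    # spacing psili + capital omega + spacing iota

def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# The combining-mark sequence is equivalent to the precomposed character;
# the sequence built from spacing forms is not.
print(canonically_equivalent(PRECOMPOSED, COMBINING_SEQUENCE))
print(canonically_equivalent(PRECOMPOSED, SPACING_SEQUENCE))
```

Because normalization reorders combining marks by combining class, the check gives the same answer whichever order the two marks are typed in.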

Regards,

--Ken Whistler


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:42 EDT