Re: Variant selectors in Mongolian

From: Martin Heijdra (mheijdra@princeton.edu)
Date: Thu Jul 11 2002 - 09:20:21 EDT


Thanks to all for discussing these matters (which a friend of mine remarked
were the a-r-c-a-n-/-a of Unicode).

Yet:

1. I think I now get the message that the intent for Mongolian is actually
to mark only irregular cases, which I think is the approach taken by some
implementations I have seen (though I have no details on the Fangzheng
Mengwen chuban xitong software, probably the main software used in China).
At least, that seems to be the conclusion if I take Ken's statements as
authoritative, which I am quite willing to do of course.

However, when I read that there is still some "tension", does that mean this
solution is still undecided? From my outsider's perspective, there IS now a
textual explanation given in Unicode 3.2, AND a table, but they seem to
conflict, and no details are given. That is, it seems there IS a standard
Unicode solution, but I cannot deduce it from the text and table together,
which is rather undesirable for a standard. Could this section be updated
with actual "examples" from Mongolian? Such examples should perhaps include
cases like ana-anda, and the exceptions shown in the GIF in the first
message.

2. Some further complications:

> When applied to Mongolian (or in principle any script like Mongolian),
> where a character is subject to positional shaping rules, you have
> to consider that character X is associated with, for example, a
> *set* of glyphs X -> {G1, G2, G3, G4} depending on positional contexts.
> A variant of character X might be associated with a variant *set*
> of glyphs, some of which could overlap, e.g. X-/ --> {G1, G2', G3', G4},
> so that the glyphs for the variant might not contrast in all
> positional (or other) contexts.
>

I actually think this is not likely in practice: I don't think any
application would use a variant selector X-/ when the glyph would be the
same as for X--there is just no reason to mark an unspecial glyph as
special, and the user would not type this--after all, both are the same
character. It would be more likely to say that X-/ --> {G1', G2', G3', G4'},
where, however, from a purely glyph perspective (say dotted versus
undotted), {G1, G2', G3', G4} might be the dotted set and {G1', G2, G3, G4'}
the undotted one.
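
To make the two models above concrete, here is a minimal Python sketch. The
glyph names are just the labels used in this thread. The assignment of
G1..G4 to isolate/initial/medial/final, and the choice of which glyphs count
as dotted, are assumptions for illustration only and are not taken from the
Unicode 3.2 table.

```python
# Positional contexts; mapping G1..G4 onto them in this order is an assumption.
POSITIONS = ("isolate", "initial", "medial", "final")

# Default glyph set for character X, as in the quoted text.
BASE = dict(zip(POSITIONS, ("G1", "G2", "G3", "G4")))

# Quoted model: the variant set overlaps with the default set.
OVERLAPPING = dict(zip(POSITIONS, ("G1", "G2'", "G3'", "G4")))

# Model proposed in the reply: the variant differs in every position.
FULL = dict(zip(POSITIONS, ("G1'", "G2'", "G3'", "G4'")))

# Hypothetical visual grouping from the reply: these glyphs carry a dot.
DOTTED = {"G1", "G2'", "G3'", "G4"}


def contrasting_positions(default, variant):
    """Positions in which the variant glyph differs from the default glyph."""
    return [p for p in POSITIONS if default[p] != variant[p]]


def dotted_positions(glyph_set):
    """Positions in which the selected glyph belongs to the dotted group."""
    return [p for p in POSITIONS if glyph_set[p] in DOTTED]


if __name__ == "__main__":
    print(contrasting_positions(BASE, OVERLAPPING))
    print(contrasting_positions(BASE, FULL))
    print(dotted_positions(BASE))
    print(dotted_positions(FULL))
```

Under these assumptions, the overlapping model contrasts in only two
positions, while the full model contrasts everywhere, yet "dotted" cuts
across both the default and the variant set.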

Moreover, the table given in Unicode 3.2 literally defines the variant
selectors for only some positions and not for others (e.g., medial and final
only, no initial, etc.). Or is that also not to be followed strictly?

3. >I agree. Although variation selectors also imply willingness to accept
>fallback to default glyphs as legible alternatives, if not the
>desired alternatives.

Even that might not be true in Mongolian. The thing is, of course, that most
of the glyphs for the -a- and -n- are the same, as if they came from the
same X and X' sets (a fact which was irrelevant for the discussion so far).
Irregular variants most often exist because in foreign words the Mongolian
rules for syllable/word formation are broken, so that one can no longer
decide whether the usual glyph is to be read as an "a" or an "n". That's why
the irregular (because usually unnecessary) dots came into being. And that's
why words without variant selectors might not be legible...
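
A toy sketch of this legibility problem, in Python. The code points for
MONGOLIAN LETTER A (U+1820), MONGOLIAN LETTER NA (U+1828) and MONGOLIAN FREE
VARIATION SELECTOR ONE (U+180B) are real, but the glyph names ("tooth",
"dotted-tooth") and the mapping from selector to glyph are invented for
illustration and are not the Unicode 3.2 table. Every letter is treated as
medial to keep the example short.

```python
FVS1 = "\u180B"  # MONGOLIAN FREE VARIATION SELECTOR ONE
A = "\u1820"     # MONGOLIAN LETTER A
NA = "\u1828"    # MONGOLIAN LETTER NA

# Hypothetical glyph tables: by default both letters share one medial glyph.
DEFAULT_MEDIAL = {A: "tooth", NA: "tooth"}
# Hypothetical variant: the selector asks for a dotted form of NA.
VARIANT_MEDIAL = {(NA, FVS1): "dotted-tooth"}


def render(text, honor_selectors=True):
    """Map a string of letters (all treated as medial) to glyph names.

    With honor_selectors=False the selector is consumed but ignored,
    which models fallback to the default glyph.
    """
    glyphs = []
    i = 0
    while i < len(text):
        ch = text[i]
        has_selector = i + 1 < len(text) and text[i + 1] == FVS1
        if has_selector and honor_selectors and (ch, FVS1) in VARIANT_MEDIAL:
            glyphs.append(VARIANT_MEDIAL[(ch, FVS1)])
        else:
            glyphs.append(DEFAULT_MEDIAL[ch])
        i += 2 if has_selector else 1
    return glyphs


if __name__ == "__main__":
    marked = A + NA + FVS1
    print(render(marked))                         # selector honored
    print(render(marked, honor_selectors=False))  # fallback to default glyphs
    print(render(A + A))                          # a different spelling
```

In this toy model the fallback rendering of the marked word is the same
glyph sequence as that of a different spelling, which is the sense in which
the default glyphs might not be a legible alternative.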

Martin Heijdra


