Re: The mystery of Edwin U+1E9A

From: Philipp Reichmuth (uzsv2k@uni-bonn.de)
Date: Tue Aug 20 2002 - 09:29:55 EDT


>>Semitic transliteration practice, if I recall correctly.

RM> It is common enough in transcribing Hebrew and Arabic.

A single character "a" with a half-ring to the upper right or on top
of it? What would it stand for in Arabic transliteration, as opposed
to separate characters "a" and "half-ring"? (It's just that I
currently don't recall ever seeing this in Arabic transliteration
practice. Maybe you could give an example?)

As far as I can see, for Arabic and Hebrew (and Semitic in general)
transliteration practice, U+1E9A is not needed at all.

RM> There is however a need (in Semitic transcription) for a corresponding
RM> series with the hook facing the opposite way, to represent the ain with a
RM> following vowel. With the present Unicode code points, one has to insert
RM> the ain U+02BF before the vowel, although it has as much "right" as the hamza
RM> U+02BE to be combined with the vowel.

Why should there be a character for the combination of hamza + vowel
or vowel + hamza in transliteration? I don't think this makes any
sense whatsoever. In the graphical appearance of transliterated text,
the hamza/ain and the vowel are usually written side by side.
Could you give an example in literature where both are combined? Does
this mean "hamza + vowel" or "vowel + hamza", as the decomposition
hint on U+1E9A seems to indicate?
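
The decomposition hint mentioned above can be inspected directly. A minimal
sketch in Python, assuming the standard `unicodedata` module and a Unicode
database that includes U+1E9A; the variable names are only illustrative:

```python
import unicodedata

ch = "\u1e9a"  # the character under discussion

# Character name and raw decomposition field from the Unicode database.
name = unicodedata.name(ch)
decomp = unicodedata.decomposition(ch)

# The mapping is a compatibility one, so only NFKD/NFKC apply it;
# canonical normalization (NFD) leaves the character untouched.
nfkd = unicodedata.normalize("NFKD", ch)
nfd = unicodedata.normalize("NFD", ch)

print(name)
print(decomp)
print([f"U+{ord(c):04X}" for c in nfkd])
print([f"U+{ord(c):04X}" for c in nfd])
```

The decomposition lists the vowel first and the half ring second, which is
why the hint reads as "vowel + hamza" rather than "hamza + vowel".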

If the combination *were* needed, why would "vowel + U+0313" or "vowel
+ U+0314" not do the task? What is the benefit of precomposing this,
as opposed to encoding the hamza and the vowel separately?
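
For comparison, the two combining sequences can be built by hand. This is
only a sketch, again assuming Python's standard `unicodedata` module; it says
nothing about how well a given font renders the result:

```python
import unicodedata

# Base vowel followed by each of the two combining marks mentioned above.
seq_0313 = "a" + "\u0313"
seq_0314 = "a" + "\u0314"

for seq in (seq_0313, seq_0314):
    mark = seq[1]
    # NFC would merge the pair into one code point if a canonical
    # precomposed form existed; here the sequence stays two code points.
    nfc = unicodedata.normalize("NFC", seq)
    print(unicodedata.name(mark), len(nfc))

# U+1E9A decomposes (compatibility) to "a" + U+02BE, a spacing modifier
# letter, so it is not equivalent to either combining sequence.
not_equivalent = unicodedata.normalize("NFKD", "\u1e9a") != seq_0313
print(not_equivalent)
```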

As far as the "right to encoding" is concerned: if one intended to be
consistent about it (notwithstanding the fact that Semitic
transliteration practice, to the best of my knowledge, does not use such
a character), one would at least need all the Semitic consonants with
hamza, too, since they can meet at syllable boundaries, as well as all
the vowels a through u in long and short variants. Oh, and with ain,
too.

RM> Latin Extended Additional is very odd and unsatisfactory
RM> collection,

It is, but seeing that precomposition appears to be a principle of the
past, Latin Extended Additional isn't going to change, and it's
probably a good idea to use combining characters instead... (even
though there appears to be no font whatsoever at present that gives
pleasant output)

  Philipp


