>The current argument on the Arabic ligatures appears very strange to me,
>and culturally biased. The Unicode standard defines them as presentation
>forms, and specifies their equivalence to their preferred encoding with
>basic letters. Thus, they are clearly specified to be equivalent to a
>sequence of basic characters, and anyone who can render Arabic correctly
>should be able to render them without difficulty.
>On the other hand, Unicode and 10646 contain hundreds of pre-composed
>Latin, Cyrillic and Greek letters, equally superfluous, equally
>decomposable, and this is acceptable because this is what our Western
>colleagues are used to.
I whole-heartedly agree. I haven't looked at Unicode 2.0, but I see
A acute, a acute, E acute, e acute, I acute, i acute etc., etc., in
Unicode 1.0, encoded as single characters. There is also the stand-alone
Why, then, is Devanagari forced to represent its ligatures as multiple
characters, to be deduced from the character encoding, with the
requirement (paraphrasing Glen Adams' words) of "complex character
encoding to glyph translation" schemes?
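To make the Devanagari case concrete, here is a minimal sketch in Python. It assumes the standard unicodedata module, which reflects a much later Unicode version than the 1.0 discussed here; the conjunct chosen (KA + VIRAMA + SSA) and the variable names are only illustrative.

```python
import unicodedata

# The conjunct usually drawn as a single ligature is stored as three
# separate characters: consonant, virama, consonant.
conjunct = "\u0915\u094D\u0937"

# Look up the character name of each code point in the sequence.
names = [unicodedata.name(c) for c in conjunct]
for c, name in zip(conjunct, names):
    print(f"U+{ord(c):04X} {name}")

# No single precomposed character exists for this ligature, so
# canonical composition (NFC) leaves the sequence unchanged.
still_three = unicodedata.normalize("NFC", conjunct) == conjunct
print(still_three)
```

The renderer has to work out the ligature glyph from this three-character sequence; nothing in the encoding itself names the ligature.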
If Latin were encoded with the same regard that is given to Devanagari,
then there would be no A acute character; it would have to be entered
as <A> + <acute sign>. Which glyphs are to be rendered is as easily
deduced from that character encoding as anything in Devanagari. And,
I believe, the way A acute is entered into a tool like TeX is
as \'A -- essentially, a two-character encoding. Instead, Unicode 1.0
has a glyph encoding for all of the letters modified with the acute,
grave, circumflex, etc. etc. signs.
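The equivalence between the single A acute character and the <A> + <acute sign> sequence can be sketched in Python as follows. This assumes the standard unicodedata module and the normalization forms (NFD, NFC) of later Unicode versions, not anything specific to Unicode 1.0; the helper name code_points is made up for the illustration.

```python
import unicodedata

def code_points(text):
    """Return the code points of text as 'U+XXXX' strings."""
    return [f"U+{ord(c):04X}" for c in text]

# The precomposed character: LATIN CAPITAL LETTER A WITH ACUTE.
precomposed = "\u00C1"

# Canonical decomposition (NFD) splits it into base letter + combining acute.
decomposed = unicodedata.normalize("NFD", precomposed)
print(code_points(precomposed))  # ['U+00C1']
print(code_points(decomposed))   # ['U+0041', 'U+0301']

# Canonical composition (NFC) maps the two-character sequence back.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == precomposed)  # True
```

Both spellings denote the same letter; the precomposed form adds nothing that the two-character sequence does not already say.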
For any other language, Arabic or Hindi, to have a glyph encoding,
however, is a no-no, and we are told to consider the allographic
versus the graphemic, to stop thinking like font designers, etc. etc.
There is no rationality in this. I hope that these irrationalities in
the Latin encoding have been removed in Unicode 2.0. Or else, I hope
the purveyors of graphemic purity have the grace to blush.
"The Unicode standard ends up straying far from the ideal with respect
to a number of basic policies, all in the interests of devising a
small enough character set.
These compromises, moreover, were made not from a neutral standpoint
but with the linguistic biases of people in the Latin language sphere
(especially the English language sphere)."
I am afraid that as more non-Western people become aware of the Unicode
standard, they are going to be easily convinced of the truth of the above
statement. Right now, the software industry and its standards are in
the custody of the western nations, but that will not be forever.
In other parts of the world, scripts even have religious significance;
I do not think people will take poor representations of them lightly.