From: Asmus Freytag (email@example.com)
Date: Sat Mar 19 2005 - 14:12:12 CST
At 01:42 AM 3/19/2005, Ahmad Gharbeia wrote:
>While the mentioned letters' names in their current incorrect state
>reflect the colloquial pronunciation in Egypt, where I am from, they
>are not the canonical, globally understood letter names and are
Presumably, the reason we have the current form of the names is
due to contributors from Egypt in the very early work of encoding
characters. Because of the merger of efforts between ISO and the
Unicode Consortium on character encoding, character names for
the Unicode Standard match the names used in ISO/IEC 10646. That
standard in turn intentionally matches the names used in 8-bit
character set standards, such as ISO/IEC 8859.
The names used for Arabic characters in Unicode therefore ultimately
have a heritage that can be traced back several decades. It is ironic
that early drafts of the Unicode Standard indeed used the names that
>While the proposed corrections do not aim to
>precisely transcribe the sounds of the letters, they are simple to
>implement and would result in identifiable names of the letters.
The purpose of the names in the Unicode Standard is twofold. On the one
hand we desire them to be descriptive so that they can be used as a
convenient handle for the character in discussions and descriptions,
or help users identify them in a list of characters.
On the other hand, they are intended to serve as formal identifiers,
just as scientific names for plants and animals. This is especially
important for characters that are also part of other ISO standards,
where they have different code numbers, but the same name.
As a requirement of this second use, names, once assigned, cannot be
changed, even to the point of preserving a typo, as in the name for
U+1D0C5, or in preserving the name of U+2118, which describes something
different from what the character actually is.
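
The stability of these two names can be checked against any copy of the
Unicode Character Database. What follows is a minimal sketch, assuming
Python's standard unicodedata module (not something mentioned in this
thread); it prints the formal names of the two characters cited above and
confirms that each name maps back to the same code point.

```python
import unicodedata

# The two characters cited above. Their formal names are frozen, flaws and all.
fthora = "\U0001D0C5"    # Byzantine musical symbol; the name spells FHTORA
weierstrass = "\u2118"   # named SCRIPT CAPITAL P despite not being a capital

for ch in (fthora, weierstrass):
    name = unicodedata.name(ch)
    # A formal name acts as an identifier: looking it up returns the character.
    assert unicodedata.lookup(name) == ch
    print(f"U+{ord(ch):04X} {name}")
```

Because the name is used as an identifier, a program that stored the
misspelled name years ago still resolves it to the same character today.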
>Although it is unlikely that this heritage of earlier encodings can
>be modified now, this should be noted, however.
An annotation or comment to the effect that the names represent
a less than universal transliteration is always possible.
>Finally, the order of Arabic letters as defined in the current version
>of Unicode, known as the Hegaa'i order, is a relatively newer order
>where letters are sorted according to their shape proximity, and is
>not the original Abgadi order, which matches the (ABC) ordering of all
>alphabets derived from the original Ugaritic alphabet.
That is something that might be noted as well. There are other scripts
for which the basic alphabet has more than one possible order. As the
ordering affects primarily the users of the printed code charts trying
to locate a character, picking a more modern ordering seems to be
As Doug Ewell already wrote, the ordering of *data* is of course not
driven by the arrangement of characters in the code table, but I think
you were not implying that.
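
To illustrate that point, here is a small sketch in Python showing that the
sort order of data is chosen by the application rather than dictated by the
code chart. The ABJADI table (one common eastern variant of the sequence)
and the abjadi_key helper are hypothetical names made up for this example;
a real application would more likely use a tailored collation than a
hand-built table.

```python
# Abjadi sequence as code points (one common eastern variant; an assumption
# of this sketch, not something defined by the Unicode code charts).
ABJADI = [0x0627, 0x0628, 0x062C, 0x062F, 0x0647, 0x0648, 0x0632,
          0x062D, 0x0637, 0x064A, 0x0643, 0x0644, 0x0645, 0x0646,
          0x0633, 0x0639, 0x0641, 0x0635, 0x0642, 0x0631, 0x0634,
          0x062A, 0x062B, 0x062E, 0x0630, 0x0636, 0x0638, 0x063A]
RANK = {chr(cp): i for i, cp in enumerate(ABJADI)}

def abjadi_key(text):
    # Characters outside the table sort after all letters, by code point.
    return [(RANK.get(ch, len(RANK)), ord(ch)) for ch in text]

# TEH, JEEM, BEH, HEH
letters = [chr(cp) for cp in (0x062A, 0x062C, 0x0628, 0x0647)]

by_code_point = sorted(letters)              # follows the code chart layout
by_abjadi = sorted(letters, key=abjadi_key)  # follows the Abgadi sequence

print([f"U+{ord(c):04X}" for c in by_code_point])
print([f"U+{ord(c):04X}" for c in by_abjadi])
```

The same four letters come out in two different orders, and neither result
requires any change to the code positions of the characters.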
This archive was generated by hypermail 2.1.5 : Sat Mar 19 2005 - 14:13:44 CST