# Re: arabic number in bidi algorithm

From: Gregg Reynolds (greynolds@greynolds.com)
Date: Thu Oct 28 1999 - 23:04:04 EDT

> On Thu, 28 Oct 1999, Reynolds, Gregg wrote:
>
> > To cut a long story short, it makes perfect sense to construe Arabic numeric
> > forms as right-to-left; they just happen to put the least significant digit
> > first. This means that under Unicode's "logical order" philosophy, least
> > significant digit should come first in an encoded representation, and
> > presentation logic need not reverse anything. This also means that software
> > designed only to handle Arabic need not worry about any bidi behavior.
>
> No! What are you telling us? I can't understand you even a bit. I write in
> Arabic script usually, except when I'm writing email. No. Logical order is
> "most-significant-first" in the world of Arabic script. And to extract the
> numerical value out of it, we perhaps need them to be encoded MSD first.
>
> There is a need for a bidi algorithm in a good arabic encoding.
>
> --Roozbeh

But ask your Arabic teacher - I mean somebody with not only native-speaker
competence, but with deep expertise in classical grammar - how the Arabs of old
would have pronounced a number like "1999". The answer you should receive is
(translated literally) "nine and ninety and nine hundred and one thousand".

I don't question your personal experience - the fact is virtually all modern
speakers of Arabic read numbers in the European manner (with one small difference
I'll explain in a minute). By your testimony, and that of a Pakistani colleague
of mine, this is also the case for Persian and Urdu speakers. But for Arabic, at
least, this is clearly an artifact of European political and economic hegemony.
Things may be different among speakers of Indo-European languages using Arabic
characters - after all, Arabic was itself an imperial language at one point.

Or: there are two ways to think about (decimal) numeric forms, the mathematical
and the linguistic. The mathematical syntax of numbers assigns a numeric value to
each position and orders the positions in some way - left to right in the case of
both modern English and classical Arabic. The linguistic interpretation is left
to right in the former, right to left in the latter.
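
To make the encoding question concrete, here is a minimal Python sketch (the
function names are mine, purely illustrative, and not part of any Unicode
specification). It shows that the numeric value can be recovered equally well
from a digit sequence stored most significant digit first or least significant
digit first; only the direction of accumulation changes.

```python
def value_msd_first(digits):
    """Value of a decimal digit sequence stored most significant digit first."""
    total = 0
    for d in digits:
        # Each new digit shifts the running total one decimal place up.
        total = total * 10 + d
    return total


def value_lsd_first(digits):
    """Value of a decimal digit sequence stored least significant digit first."""
    total = 0
    place = 1
    for d in digits:
        # Each new digit occupies the next higher decimal place.
        total += d * place
        place *= 10
    return total


# "1999" in the two hypothetical storage orders.
print(value_msd_first([1, 9, 9, 9]))
print(value_lsd_first([9, 9, 9, 1]))
```

Both calls yield the same integer, so the choice of stored order is a matter of
convention rather than of computability.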

I don't think you would have much trouble finding traditionalistic teachers in
Cairo, Damascus, or Baghdad who teach literary Arabic with numbers spoken least
significant digit first. I've verified this with a native speaker who has taught
Arabic at University for many years (I withhold his name to protect the innocent.)

The difference I mentioned above is that in (modern) Arabic the last two digits
(tens and ones) are enunciated ones first, then tens. So "1987" is verbalized as
"one thousand and nine hundred and seven and eighty". And among some at least it
is physically written in the same order - skip to the left, write "19" moving to
the right, then skip to the right and write "7" in the ones position then "8" in
the tens position. Which rather inconveniences Unicode's notion of "logical"
order.
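
The two reading orders described in this message can be sketched in Python as
follows. This is a toy model under stated assumptions: it handles only small
positive integers such as four-digit years, it treats each non-zero decimal
place as a separate spoken unit, and it ignores the special forms for 11-19
and everything else in the grammar. The function names are hypothetical.

```python
def place_values(number):
    """Non-zero place values of a positive integer, largest first.

    For example, 1987 gives [1000, 900, 80, 7].
    """
    parts = []
    place = 1
    while number > 0:
        number, digit = divmod(number, 10)
        if digit:
            parts.append(digit * place)
        place *= 10
    return parts[::-1]


def classical_order(number):
    """Least significant unit first: ones, tens, hundreds, thousands."""
    return place_values(number)[::-1]


def modern_order(number):
    """Higher units first, but ones spoken before tens (simplified)."""
    parts = place_values(number)
    higher = [p for p in parts if p >= 100]
    tens = [p for p in parts if 10 <= p < 100]
    ones = [p for p in parts if p < 10]
    return higher + ones + tens


print(classical_order(1999))  # units in the classical spoken order
print(modern_order(1987))     # units in the modern spoken order
```

For 1999 the classical order comes out as 9, 90, 900, 1000, matching "nine and
ninety and nine hundred and one thousand"; for 1987 the modern order comes out
as 1000, 900, 7, 80, matching "one thousand and nine hundred and seven and
eighty".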

One of these days in the next, oh, 50 years I'll slap some supporting evidence on
a web page. But just to give you an idea of what this means: a well-known modern
Arabic Grammar on my bookshelf consists of four volumes of roughly 800 pages
each. The section on "Number" runs to over 70 pages. (The numbers are the most
difficult part of Arabic grammar, by a large margin.) Wright's Grammar,
standard in the English-speaking world, has some info but it's scattered
throughout. He states that numbers can be read either way, but most of his
examples use left-to-right reading. But then Wright was an Englishman translating
the work of a German at the height of the British Empire. Draw your own
conclusions.

-gregg

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:54 EDT