> On Sat, 17 Mar 2001, Marco Cimarosti wrote:
> > Do you know how teh marbutah sorts in Arabic or other languages written in
> > Arabic script?
> In Arabic it's sorted before Teh, I believe.

It's not considered a "character" in Arabic; it's not part of the abjad. In other words, it doesn't sort at all. That's in Arabic, mind you, not computerese; Arabic doesn't do "alphabetic" sorting. Although I guess you could say it always sorts after the same form without the teh marbuta: kitAba# comes after kitAb.

> > BTW, is it correct that teh marbuta in Farsi only occurs in words of Arabic
> > origin?
> Yes, and they are becoming fewer with time, they become Heh or Teh. The
> most common word using Teh Marbuta is "daa'erat-ol-ma'aaref"

In the two Arabic dictionaries I checked briefly, daa'erat was placed in the midst of a bunch of other d-w-r words, some of which end in teh marbuta. The ordering within a root is based on morphology and semantics, not on alphabetic ordering. For example, teh marbuta is often suffixed to a generic noun to make it a unit noun: Sadm = verbal noun meaning roughly the action of striking a blow or colliding; Sadma# = a single blow or strike, etc.

Example of semantic teh marbuta sorting, from a modern Egyptian-English dictionary (using 9 for ain, # for teh marbuta): raba9, rab9, rab9a#, rub9, rub9a#, ribi9, arba9, arba9a#, etc.
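
For the computerese-minded, here is a rough sketch of what "sort by root, then keep the lexicographer's order inside the root" could look like. Everything in it is my own assumption for illustration: the names (LETTER_ORDER, lexicon, root_key), the hand-assigned roots, and the cut-down letter table, which covers only the letters these two roots need. The entry lists are just the partial samples quoted in this message; the order inside each root is copied by hand, since it cannot be computed from the spelling.

```python
# Letters needed for the sample roots, in Arabic alphabetical order,
# using the ad hoc transliteration of this message (9 = ain, S = sad).
LETTER_ORDER = "bdrS9m"
RANK = {letter: i for i, letter in enumerate(LETTER_ORDER)}

# Hand-built sample: root -> entries in the order the dictionary lists them.
# The inner order is editorial (morphology/semantics), so it is stored, not derived.
lexicon = {
    "Sdm": ["Sadm", "Sadma#"],
    "rb9": ["raba9", "rab9", "rab9a#", "rub9", "rub9a#", "ribi9", "arba9", "arba9a#"],
}

def root_key(root):
    """Compare roots radical by radical, first radical first."""
    return [RANK[radical] for radical in root]

def entries_in_dictionary_order(lex):
    """Walk the roots in sorted order, emitting each root's entries as stored."""
    ordered = []
    for root in sorted(lex, key=root_key):
        ordered.extend(lex[root])
    return ordered

ordered = entries_in_dictionary_order(lexicon)
print(ordered)
```

Note that a plain string sort of the same words would pull arba9 away from its r-b-9 siblings, which is exactly what a root-based dictionary avoids.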

Point of interest: all the great Arabic lexicons, of which there are many, sort based on radical structure. But they disagree on how to order the radicals. Most compare the first radical, then the second, then the last (this is the scheme most dictionaries use today). But one of the best known, "Lisaan al-Arab", uses last-first-middle ordering.
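
To make the difference between the two radical orderings concrete, here is a minimal sketch, assuming triliteral roots written in the same ad hoc transliteration (9 = ain, S = sad). The function names and the cut-down letter table are mine, not anything taken from the lexicons themselves, and the four roots are only samples.

```python
# Letters used by the sample roots, in Arabic alphabetical order:
# ba, ta, dal, ra, sad, ain, kaf, mim, waw.
LETTER_ORDER = "btdrS9kmw"
RANK = {letter: i for i, letter in enumerate(LETTER_ORDER)}

roots = ["rb9", "dwr", "ktb", "Sdm"]

def first_to_last_key(root):
    """Order by first radical, then middle, then last."""
    first, middle, last = root
    return (RANK[first], RANK[middle], RANK[last])

def last_first_middle_key(root):
    """Order by last radical, then first, then middle."""
    first, middle, last = root
    return (RANK[last], RANK[first], RANK[middle])

by_first = sorted(roots, key=first_to_last_key)
by_last = sorted(roots, key=last_first_middle_key)

print(by_first)  # ['dwr', 'rb9', 'Sdm', 'ktb']
print(by_last)   # ['ktb', 'dwr', 'rb9', 'Sdm']
```

The same four roots come out in a different sequence under each key, so a reader has to know which scheme a given lexicon follows before looking anything up.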


