From: Gregg Reynolds (firstname.lastname@example.org)
Date: Wed Mar 02 2005 - 14:04:33 CST
Tom Emerson wrote:
> Gregg Reynolds writes:
>>I see in the new version that teh marbuta (0629) is still listed as
>>right-joining. This is arguably incorrect with respect to written
>>Arabic (or at least not such good design); it would be more usefully
>>construed as a dual joiner. I don't know how it plays in other
>>languages that use the Arabic script.
> I'm interested in your arguments that teh marbuta isn't right joining:
> how is this not correct? It can only appear word-final, which
> precludes its use as a dual joiner.
Not quite; depends on what you mean by "word". I'll give you a simple
example to illustrate; a more detailed explanation would involve an
explanation of how spoken Arabic works and how it is represented in
written Arabic, which I'd be happy to provide if you're interested, but
for now let's stick with an example.
The word "risala#" (رسالة) means roughly "letter, message". (I use # as
teh marbuta.) Pronounced in isolation, the word ends in a soft 'h'
sound - which is why the teh marbuta (in this form) looks like a 'heh'
(ه). Suffix the word with a personal pronoun (indicating possession) and
you get "risalat*kum" (رسالتكم) (I use * to mean any short vowel). The
pronunciation is /t/, just like the teh (ت). But the identity of the
"character" has not changed; it is still teh marbuta, which means "bound
or joined teh". Note that the iso/fin form combines the shape of the
heh and the two dots of the teh. Note also that teh marbuta is not
traditionally considered a first-class letter in the abjadia; instead it
is a clever solution to the problem that a single character (in the deep
orthography, if that's the right term) takes two completely different
pronunciations depending on context. I suppose the linguists have a
word for this sort of thing; to me it looks like teh marbuta makes
explicit a feature of deep orthography, or morphology, or in any case
its semiotics (can you tell I'm grasping here?) differ from those of
the "normal" letters. This is in Arabic; I dunno about Persian, etc.
In other words, it would be useful to encode the *character* teh
marbuta, as understood in Arabic tradition. So, e.g., a search for
risala# should match risalat*kum, and when the -kum is deleted in an
editor the software knows the shape of the # should revert to the
I suspect this might entail disunifying teh marbuta as used in writing
the Arabic language from the 'heh-with-two-dots' used in other languages
that use the Arabic script.
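To make the search case concrete, here is a rough sketch of how software might approximate "risala# matches risalat*kum" with the code points as they are encoded today. It is only an illustration under my own assumptions: the names fold and matches are hypothetical, the short vowels (*) are taken to be the marks U+064B..U+0652, and teh (062A) and teh marbuta (0629) are simply folded together, which will over-match words containing an ordinary teh.

```python
import re

TEH_MARBUTA = "\u0629"  # teh marbuta
TEH = "\u062A"          # ordinary teh

# Assumption: the optional short-vowel marks are the range
# U+064B (fathatan) through U+0652 (sukun).
HARAKAT = re.compile("[\u064B-\u0652]")


def fold(text: str) -> str:
    """Build a search key: drop short-vowel marks and treat
    teh and teh marbuta as the same letter."""
    return HARAKAT.sub("", text).replace(TEH, TEH_MARBUTA)


def matches(query: str, text: str) -> bool:
    """True if the folded query occurs inside the folded text."""
    return fold(query) in fold(text)


RISALA = "\u0631\u0633\u0627\u0644\u0629"                 # risala#
RISALATKUM = "\u0631\u0633\u0627\u0644\u062A\u0643\u0645"  # risalat*kum

print(matches(RISALA, RISALATKUM))
```

This works around the encoding rather than fixing it: the text still stores a plain teh in the suffixed form, so an editor has no way to know that deleting -kum should bring back the teh marbuta shape.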
Hope that helps.