From: Roozbeh Pournader (roozbeh@htpassport.com)
Date: Thu Apr 30 2009 - 16:04:54 CDT
On Thu, 2009-04-30 at 15:21 +0200, Titus Nemeth wrote:
> I have a manuscript in Berber language of the poet Muhammad al-Awzali in
> Arabic script and want to type it. It contains a few of the "Berber"
> characters (Kah with three dots below etc.), among them a miniature Ayn.
> I was not able to find it encoded in the Unicode charts and also the
> list-archives did not show results to my queries. One of the words that
> use the letter is for example:
>
> "miniature Ayn" (Fatha) + Alif + Yeh (Sukun) + Lam (Shadda/Fatha) + Nun
> (Sukun)
Would it be possible for you to upload a scan or photo of the word
somewhere and send the list a link to the image?
> I am not familiar with Berber languages which makes it more difficult to
> find out about this. I saw the use of a Greek Epsilon on a "TAMAZIGHT"
> website, but doubt that this is conventional.
It is definitely not conventional, especially if you want to put a Fatha
over the letter. I would not recommend using the Greek Epsilon for this.
> The only potential Unicode I found is U+01B9
>
> 01B9 ƹ LATIN SMALL LETTER EZH REVERSED
>      • archaic phonetic for voiced pharyngeal fricative
>      • sometimes typographically rendered with a turned digit 3
>      • recommended spelling 0295 ʕ
>      → 0295 ʕ latin letter pharyngeal voiced fricative
>      → 0639 ع arabic letter ain
>
> Yet, I do not understand the relation to Ayn and whether this code would
> actually be used in my context.
There are two relations to Arabic Ain:
* This Latin letter has been historically used to transcribe the sound
of Ain (note: some/most linguists say that Ain in Arabic is not
pharyngeal but epiglottal instead).
* It looks similar to Ain, and its original shape may be based on Ain.
But U+01B9 should not be used for your purpose either. This is clearly a
Latin letter.
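To make the point concrete, here is a minimal sketch using Python's
standard unicodedata module. The helper name describe is only for
illustration. It shows that U+01B9 carries Latin-letter properties
(lowercase letter, left-to-right), while U+0639 is an Arabic letter, so
software will not treat the two as related:

```python
import unicodedata

def describe(ch):
    """Return (code point, name, general category, bidi class) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),       # "Ll" = lowercase letter, "Lo" = other letter
        unicodedata.bidirectional(ch),  # "L" = left-to-right, "AL" = Arabic letter
    )

# Compare the Latin phonetic letter with the real Arabic Ain.
for ch in ("\u01B9", "\u0639"):
    print(describe(ch))
```

A left-to-right Latin letter placed inside an Arabic word would also not
join with its neighbours, which is one more reason to avoid it here.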
> Moreover, I wonder about the encoding of Feh with dot below (06A2) and
> the Qaf with a single dot above (06A7). As far as I have understood
> (correct me if I'm wrong), those two letters are only graphically
> distinct from the regular Feh (0641) and Qaf (0642).
Unicode tends to encode the Arabic script more graphically than some
would expect.
Another commonly cited case is U+0643 ARABIC LETTER KAF vs U+06A9
ARABIC LETTER KEHEH. In some languages, the glyph shapes used in the
Unicode charts are both considered OK, although there is usually a
preference for one form over the other.
There are various reasons some of these pairs have been encoded
separately. One is that some languages use both forms with a phonemic
or semantic difference. For example, while U+06CC ARABIC LETTER FARSI
YEH and U+06D2 ARABIC LETTER YEH BARREE are considered graphical
variants in Persian, their distinction is important in various South
Asian languages written in the Arabic script.
Generally, I would recommend encoding the text graphically if your
readership would be specialists: If the source material puts a dot under
the Feh, use U+06A2. That way, you would keep the distinction in the
source material. You can also provide a standardized/simplified version
to ease searching with software tools that don't know there is a
relation between U+06A2 and U+0641, or for cases when fonts to render
the text are hard to find.
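The simplified version for searching could be produced roughly as
follows. This is only a sketch: the table SEARCH_FOLD and the function
names are my own assumptions, not part of Unicode or of any library,
and you would extend the table with whatever other letters your
manuscript needs. The second function shows why such a table is
necessary at all: these letters have no decomposition in the Unicode
data, so normalization alone will never map U+06A2 to U+0641.

```python
import unicodedata

# Hypothetical folding table: graphic variants mapped to the standard letters.
SEARCH_FOLD = {
    0x06A2: 0x0641,  # FEH WITH DOT MOVED BELOW -> FEH
    0x06A7: 0x0642,  # QAF WITH DOT ABOVE       -> QAF
}

def fold_for_search(text):
    """Return a simplified copy of text for indexing and searching only."""
    return text.translate(SEARCH_FOLD)

def has_decomposition(ch):
    """True if the Unicode data defines a decomposition mapping for ch."""
    return unicodedata.decomposition(ch) != ""

graphic = "\u06A2\u06A7"          # text as written in the manuscript
print(fold_for_search(graphic))   # standard Feh followed by standard Qaf
print(has_decomposition("\u06A2"), has_decomposition("\u06A7"))
```

The graphically encoded text stays the master copy; the folded string is
only an index next to it.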
Still, if the text is to be read by the general public only, you may
want to only use the standardized orthography of the common language. I
would not use U+063C ARABIC LETTER KEHEH WITH THREE DOTS BELOW if I'm
typing a classic Persian poem from a manuscript for my weblog. I would
use U+06AF ARABIC LETTER GAF. I would only use U+063C in documents where
I wish to discuss the specific classical orthography that has used three
dots under the letter skeleton.
> I also wondered whether the Unicode values for these letters are actually
> used by anyone?
Oh, definitely. To cite a commonly available resource, you can usually
find Wikipedia articles using such characters easily.
But fonts and keyboards are usually the barrier to adoption of Unicode
characters. Until there is an easy way to enter and display a certain
character, users tend to avoid it.
Roozbeh