Question about Arabic block description in Unicode 2.0

From: Yung-Fong Tang (ftang@netscape.com)
Date: Mon Sep 13 1999 - 10:58:30 EDT


Arabic experts:

I am trying to implement the Arabic shaping algorithm by following the
Arabic block description in Unicode 2.0 (pages 6-22 to 6-32). I found one
problem and also have some suggestions:

Problem 1:

Page 6-24:

     Table 6-3 Arabic Joining Classes
     Joining Class
     Right-joining
     Left-joining
     Dual-joining
     Join-causing
     Non-Joining
     Transparent

     In addition to the above classes, two superset classes will be
     employed as follows: a right-join causing character is either
     a dual-joining, right-joining or join-causing character; a
     left-join causing character is either a dual-joining,
     left-joining or join-causing character. Here right and left
     refer to visual order.
     [...]
     R2 A right-joining character X that has a right join-causing
     character on the right will adopt the form Xr.
     [...]
     R3 A left-joining character X that has a join-causing
     character on the right will adopt the form Xl.
     [...]
     R4 A dual-joining character X that has a join-causing
     character on the right and a join-causing character on the
     left will adopt the form Xm.
     [...]
     R5 A dual-joining character X that has a join-causing
     character on the right and no join-causing character on the
     left will adopt the form Xr.
     [...]
     R6 A dual-joining character X that has a join-causing
     character on the left and no join-causing character on the
     right will adopt the form Xl.

The whole paragraph is very confusing because there is a character
class called join-causing, but there are also the superset classes
right-join causing and left-join causing. The rules in the original text
say "join-causing character" where they should say "right-join causing
character" or "left-join causing character".
Also, the definition of right-join causing and

I suggest we change to the following

     Table 6-3 Arabic Joining Classes
     Joining Class
     Right-joining
     Left-joining
     Dual-joining
     Join-causing
     Non-Joining
     Transparent

     In addition to the above classes, two superset classes will be
     employed as follows: a right-join causing character is either
     a dual-joining, left-joining or join-causing character; a
     left-join causing character is either a dual-joining,
     right-joining or join-causing character. Here right and left
     refer to visual order.
     [...]
     R2 A right-joining character X that has a right-join causing
     character on the right will adopt the form Xr.
     [...]
     R3 A left-joining character X that has a left-join causing
     character on the right will adopt the form Xl.
     [...]
     R4 A dual-joining character X that has a right-join causing
     character on the right and a left-join causing character on
     the left will adopt the form Xm.
     [...]
     R5 A dual-joining character X that has a right-join causing
     character on the right and no left-join causing character on
     the left will adopt the form Xr.
     [...]
     R6 A dual-joining character X that has a left-join causing
     character on the left and no right-join causing character on
     the right will adopt the form Xl.

Note that the RED text is what I changed.

Arabic experts, is this correct?
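
To make the proposed wording easier to check, here is a minimal Python
sketch of the two superset classes and rules R2 to R6. The names
JoinClass and glyph_form are made up for illustration and are not from
the standard. One assumption to flag: R3 is read here as testing the
LEFT neighbour, on the reasoning that a left-joining character can only
connect on its left side, although the wording quoted above says "on the
right". Transparent characters are assumed to have been skipped by the
caller.

```python
from enum import Enum


class JoinClass(Enum):
    RIGHT = "right-joining"
    LEFT = "left-joining"
    DUAL = "dual-joining"
    CAUSING = "join-causing"
    NONE = "non-joining"
    TRANSPARENT = "transparent"


# Superset classes, following the proposed (swapped) definitions.
RIGHT_JOIN_CAUSING = {JoinClass.DUAL, JoinClass.LEFT, JoinClass.CAUSING}
LEFT_JOIN_CAUSING = {JoinClass.DUAL, JoinClass.RIGHT, JoinClass.CAUSING}


def glyph_form(x, on_right, on_left):
    """Return 'Xn', 'Xr', 'Xl' or 'Xm' for a character of class x.

    on_right and on_left are the classes of the visual neighbours
    (None at a text boundary), with transparent characters skipped.
    """
    joins_right = on_right in RIGHT_JOIN_CAUSING
    joins_left = on_left in LEFT_JOIN_CAUSING
    if x is JoinClass.RIGHT:
        return "Xr" if joins_right else "Xn"  # R2
    if x is JoinClass.LEFT:
        # R3, assumed to look at the left neighbour (see note above).
        return "Xl" if joins_left else "Xn"
    if x is JoinClass.DUAL:
        if joins_right and joins_left:
            return "Xm"  # R4
        if joins_right:
            return "Xr"  # R5
        if joins_left:
            return "Xl"  # R6
    # Everything else keeps its nominal form.
    return "Xn"
```

For example, glyph_form(JoinClass.DUAL, JoinClass.DUAL, JoinClass.RIGHT)
gives "Xm": the dual-joining neighbour on the right is right-join
causing, and the right-joining neighbour on the left is left-join
causing.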

I found the text difficult to follow because of inconsistent terms and
symbols. Below are my suggestions:

Suggestion 1:
The notation used in this text is very inconsistent. For example, it
uses the following three ways to mark glyph types:

a) page 6-25

Glyph Types
Xn
Xr
Xl
Xm

b) page 6-27
CHAR .N
CHAR .R
CHAR .M
CHAR .L

c) Arabic Presentation Forms-A (pages 7-484 to 7-501) and Arabic
Presentation Forms-B (pages 7-509 to 7-512)
<isolated>
<final>
<initial>
<medial>

There is no reason we should not use the same notation. Please unify
this in Unicode 3.0.
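
As a small illustration of how the three notations line up, here is a
Python sketch. The pairing of rows is my reading of the text (Xr is the
form that joins on the right, which in right-to-left text is the final
form), and the names GLYPH_NOTATIONS and glyph_type_of are made up for
this example. It relies on the compatibility decomposition tags that
Python's unicodedata module reports for the presentation forms.

```python
import unicodedata

# One row per glyph type:
# (page 6-25 notation, page 6-27 suffix, presentation-form tag)
GLYPH_NOTATIONS = [
    ("Xn", ".N", "<isolated>"),
    ("Xr", ".R", "<final>"),
    ("Xl", ".L", "<initial>"),
    ("Xm", ".M", "<medial>"),
]

TAG_TO_X = {tag: x for x, _, tag in GLYPH_NOTATIONS}


def glyph_type_of(presentation_form):
    """Map a presentation-form character to Xn/Xr/Xl/Xm, or None."""
    decomposition = unicodedata.decomposition(presentation_form)
    tag = decomposition.split(" ", 1)[0]
    return TAG_TO_X.get(tag)


# U+FE8F is ARABIC LETTER BEH ISOLATED FORM
print(glyph_type_of("\uFE8F"))  # prints Xn
```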

Suggestion 2: Table 6-9 uses R, D, C, U to mark Link. However, there
is no explanation of what they mean. I suggest we add the following note:

R: Right-joining
D: Dual-joining
C: Join-causing
U: ??? (I have no idea what this means)
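
For what it is worth, the Unicode Character Database file
ArabicShaping.txt uses the same single-letter codes, and there U stands
for non-joining. Assuming Table 6-9 follows the same convention (an
assumption, not something the table itself states), the note could be
captured as a lookup table like this Python sketch. The name LINK_CODES
is made up for illustration.

```python
# Link codes as used in ArabicShaping.txt. Whether Table 6-9 means
# exactly the same thing by "U" is an assumption.
LINK_CODES = {
    "R": "Right-joining",
    "L": "Left-joining",
    "D": "Dual-joining",
    "C": "Join-causing",
    "U": "Non-joining",
    "T": "Transparent",
}


def describe_link(code):
    """Expand a one-letter link code, or flag it as unknown."""
    return LINK_CODES.get(code, "unknown link code: " + code)
```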

Also, there is no explanation of what the "Link Group" column in the
table is for. Do we really need this column?

It would also be nice if someone could contribute code that converts
logical-order basic Arabic to visual-order Arabic Presentation Forms-B
as an example here. I have hacked together some code, but it is not
complete and needs Arabic experts to review and fix it.
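
To show the kind of example meant here, below is a rough Python sketch,
not a complete or reviewed implementation. Assumptions: the form table
is built from the decomposition data in Python's unicodedata module; the
joining class of a letter is guessed from which Presentation Forms-B
glyphs exist for it (a heuristic, not the table from the standard);
LAM-ALEF ligatures are not handled; and the result stays in logical
order. The names build_forms, join_class and shape are made up.

```python
import unicodedata

TAGS = ("<isolated>", "<final>", "<initial>", "<medial>")
TATWEEL = "\u0640"


def build_forms():
    """Map base letter -> {tag: form} from Arabic Presentation Forms-B."""
    forms = {}
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep single-letter forms only; this skips ligatures.
        if len(parts) == 2 and parts[0] in TAGS:
            base = chr(int(parts[1], 16))
            forms.setdefault(base, {})[parts[0]] = chr(cp)
    return forms


FORMS = build_forms()


def join_class(ch):
    """Heuristic joining class: D, R, C, T or U."""
    if ch == TATWEEL:
        return "C"
    if unicodedata.category(ch) == "Mn":
        return "T"  # combining marks are treated as transparent
    forms = FORMS.get(ch, {})
    if "<medial>" in forms:
        return "D"
    if "<final>" in forms:
        return "R"
    return "U"


def shape(text):
    """Replace basic Arabic letters by positional forms (logical order)."""
    classes = [join_class(c) for c in text]
    out = []
    for i, ch in enumerate(text):
        cls = classes[i]
        if cls not in ("R", "D"):
            out.append(ch)
            continue
        # Nearest non-transparent neighbours in logical order.
        prev = next((classes[j] for j in range(i - 1, -1, -1)
                     if classes[j] != "T"), "U")
        nxt = next((classes[j] for j in range(i + 1, len(text))
                    if classes[j] != "T"), "U")
        # The logically previous character sits on the right visually.
        joins_right = prev in ("D", "C")
        joins_left = cls == "D" and nxt in ("D", "R", "C")
        if joins_right and joins_left:
            tag = "<medial>"
        elif joins_right:
            tag = "<final>"
        elif joins_left:
            tag = "<initial>"
        else:
            tag = "<isolated>"
        out.append(FORMS[ch].get(tag, ch))
    return "".join(out)
```

For example, shape("\u0643\u062A\u0627\u0628") (KAF, TEH, ALEF, BEH)
yields KAF initial, TEH medial, ALEF final and BEH isolated. Getting
from there to true visual order needs bidirectional reordering, which
this sketch leaves out.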




