Re: Uyghur HEH vs. AE

From: becker.osbu_north@xerox.com
Date: Tue Aug 20 1996 - 18:24:43 EDT


Tom Milo, 6 Aug 96:
> Uyghur uses HEH U+0647 from the Arabic core set to represent the consonant
> phoneme /h/, and AE U+06D5 to represent the vowel phoneme /e/. The
> isolated and final forms of AE are identical to those of HEH. However, AE
> is a disconnecting letter, i.e., it behaves like ALEF, DAL and REH. Therefore
> AE has only final and isolated forms, followed by the initial form of the
> next letter.

Right.

Michael Forgey, 7 Aug 96:
> I believe that the Unicode manuals explain that AE (06D5) does not
> undergo any (contextual) shaping behavior. [The Unicode Standard 2.0, draft
> 12/1/95, p.4-33; and The Unicode Standard 1.0 Vol. 2, p.398.]

Michael Forgey, 9 Aug 96:
> ... the Unicode manual
> seems to say that AE 06D5 does not connect at all. But this does not
> seem to be true. Is the Unicode manual in error on this point?

Yes, it is an error, almost certainly my fault. Thank you for finding it!

In the linking-class listing in The Unicode Standard, Version 1.0, Volume 2,
page 398, the very last entry U+06D5 ARABIC LETTER AE is listed as "U <no
shaping>". Instead it should be "R TAA MARBUTAH", and ARABIC LETTER AE should
be shown in the TAA MARBUTAH link group on p. 394. I do not have the final
text of the v2.0 standard, but the error has likely been transcribed there too
as you suggest.
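
The corrected classification can be sketched in Python roughly as follows. This
is only an illustration, not the standard's own listing: the table JOINING and
the helpers joins_to_previous / joins_to_next are hypothetical names, the table
covers just a few letters from this thread, and the label "D" for letters that
connect on both sides is an assumption made for the sketch.

```python
# Hypothetical, partial joining table: only letters discussed in this thread.
# "R" = connects only to the preceding letter (like TAA MARBUTAH);
# "D" = connects on both sides (label assumed for this sketch);
# "U" = no shaping (the erroneous listing for AE).
JOINING = {
    "\u0627": "R",  # ARABIC LETTER ALEF
    "\u062F": "R",  # ARABIC LETTER DAL
    "\u0631": "R",  # ARABIC LETTER REH
    "\u0629": "R",  # ARABIC LETTER TEH MARBUTA
    "\u06D5": "R",  # ARABIC LETTER AE (corrected; was listed as "U")
    "\u0647": "D",  # ARABIC LETTER HEH
    "\u0628": "D",  # ARABIC LETTER BEH
}


def joins_to_previous(ch: str) -> bool:
    """True if ch can connect to the letter before it."""
    return JOINING.get(ch, "U") in ("R", "D")


def joins_to_next(ch: str) -> bool:
    """True if ch can connect to the letter after it."""
    return JOINING.get(ch, "U") == "D"
```

With this table, AE connects to a preceding BEH or HEH but never to the letter
that follows it, which is the behavior Tom Milo describes above.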

Tom Milo, 7 Aug 96:
> ... the right glyph assimilation pattern ("context analysis"). This is
> IMHO what is not well-defined in the Arabic Encoding of UNICODE/10646.
> A clear indication which pattern is needed for connecting the letters is
> needed.

? Did you find the "Arabic Character Shaping" section of the Unicode Standard,
Version 1.0, Volume 2, pp. 388-398, to be unclear? Or perhaps we successfully
hid it from you behind all that Chinese?!

Michael Forgey, 9 Aug 96:
> I was referring
> to the image of the glyphs used for the isolate and final HEH; (i.e.,
> in the Uyghur character charts I have seen, isolate HEH uses the
> same glyph as the initial HEH ...

You may have run afoul of the convention that the isolate HEH is often
illustrated in character code charts via the glyph for initial HEH, a tradition
arising from the desire to distinguish HEH from the digit 5. But it just makes
things more confusing, heh? At any rate, the codes you name for U+0647
Arabic/Uyghur HEH are correct, plus:

U+06D5 ARABIC LETTER AE // the abstract letter

U+FEE9 ARABIC LETTER HEH ISOLATED FORM // (looks sort of like "o") glyph also
                                       // for isolate & initial ARABIC LETTER AE
U+FEEA ARABIC LETTER HEH FINAL FORM    // (looks sort of like "a") glyph also
                                       // for final & medial ARABIC LETTER AE

U+FEEC ARABIC LETTER HEH MEDIAL FORM   // this was the glyph for FINAL FORM in
                                       // Unicode v1.0! (*)

(*) Just to trip up anyone who had made it through the minefield this far, at
ISO's [expletive deleted] behest Unicode CHANGED THE CODE ASSIGNMENTS of most
of these Arabic glyphs between v1.0 and v2.0 !
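
For anyone implementing this, here is a minimal Python sketch of the glyph
choice for HEH and AE, using the v2.0 code assignments listed above. The
function pick_glyph and its two boolean parameters are hypothetical names
invented for this example, and U+FEEB (HEH INITIAL FORM) is added from the
Unicode character names rather than from the list above. It is a sketch of the
idea, not a complete shaping engine.

```python
import unicodedata

AE = "\u06D5"   # ARABIC LETTER AE
HEH = "\u0647"  # ARABIC LETTER HEH

# Unicode 2.0 presentation-form code points for HEH.
HEH_FORMS = {
    "isolated": "\uFEE9",
    "final": "\uFEEA",
    "initial": "\uFEEB",  # not in the list above; taken from the character names
    "medial": "\uFEEC",
}


def pick_glyph(letter, joined_from_prev, next_can_join):
    """Pick a presentation-form glyph for HEH or AE (hypothetical helper).

    joined_from_prev: the preceding letter connects forward to this one.
    next_can_join:    the following letter can connect back to this one.
    """
    if letter not in (AE, HEH):
        raise ValueError("this sketch only handles HEH and AE")
    # AE never connects forward, so the following letter is irrelevant for it.
    connects_forward = next_can_join and letter == HEH
    if joined_from_prev and connects_forward:
        form = "medial"
    elif joined_from_prev:
        form = "final"
    elif connects_forward:
        form = "initial"
    else:
        form = "isolated"
    # AE borrows HEH's isolated and final glyphs, so one table serves both.
    return HEH_FORMS[form]


for letter in (HEH, AE):
    glyph = pick_glyph(letter, joined_from_prev=True, next_can_join=True)
    print(unicodedata.name(letter), "->", unicodedata.name(glyph))
```

In the same mid-word context, HEH comes out as the medial form while AE comes
out as the HEH final form, and the next letter then starts afresh.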

"If it was easy, anyone could do it,"

Joe


