Fw: Encoding Bengali Vowel forms (again)

From: mijan (meejan@hotmail.com)
Date: Fri Apr 28 2000 - 17:38:50 EDT


----- Original Message -----
From: "Abdul Malik" <zzak@csi.com>
To: <opentype@list.sirius.com>
Sent: Friday, April 28, 2000 10:32 PM
Subject: Re: Encoding Bengali Vowel forms (again)

 Here is my response to Marco

 Marco said that:
 Abdul Malik wrote in his report:
> The problem
> Unicode allows conjunct part glyphs such as zophola to be
> formed only by placing the Virama sign ( >) between two
> consonants. When ‘zophola AA_sign’ is placed after Letter_E
> or Letter_AA it is not considered to form conjunct with the
> vowel, it only serves to function as a vowel modifier.
> The zophola-AA sign can not be included in Unicode as a vowel
> modifier sign however, as when placed after a consonant it is
> considered to have different semantics. It would also be
> illegal to place it after a vowel sign.

 Marco said:
  Assumptions #1 and #3 are totally false

 1) "Unicode allows conjunct part glyphs [...] to be formed only by
 placing the Virama sign between two consonants."

 Abdul says: OK. Cut the word ‘only’ from the above sentence.

 3) "It would also be illegal to place it [a virama] after a vowel
 sign."

 Abdul says: I was not referring to ‘[a virama]’ I was referring to the
 ‘zophola_AA sign’.

 The point I was making is that this sequence is considered illegal in
 Bengali so it does not make sense to include zophola_AA in the Various
 signs section of the Bengali Unicode range.

 Assumption #2 is irrelevant: the precise grammatical or phonetic function
 of characters is not an issue for encoding.

 2) "When ‘zophola AA_sign’ is placed after Letter_E or Letter_AA it
 is not considered to form conjunct with the vowel, it only serves to
 function as a vowel modifier."

 Abdul says: Irrelevant? The Unicode charts have been, where possible, laid
 out with Vowels, Consonants etc. grouped together. Also, the function of
 characters is an issue for rendering engines.

 Unicode does not have a "syntax" that
 stipulates which sequences of characters are legal and which are not.

 Abdul says: We need legal sequences defined in Unicode for Indic rendering,
 otherwise how are we to program our rendering mechanisms? A programmer can
 not be expected to be an expert on every language.

 Conclusion
> ‘Vowel A_zophola_AA’ and ‘Vowel E_zophola_AA’ need to be
> included in the Bengali Unicode range as separate vowels.
> [...]

 I have no opinion on whether this proposal should be accepted.

 Abdul says: I need your opinions

 As I see it, zophola is just the special glyph used to represent the
 sequence of these two characters:

 09CD (B. SIGN VIRAMA) + 09AF (B. LETTER YA)

 The formation of this ligature can and should be totally *unconditional*: I
 see no valid reason to bother checking for special conditions.

 Abdul says: As I said, rendering machines need to check for special
 conditions.

 This means that:

 - zophola (in *any* position) can be encoded as:
 09CD (B. SIGN VIRAMA) + 09AF (B. LETTER YA)

 And, consequently:

 - zophola_aa can be encoded as:
 09CD (B. SIGN VIRAMA) + 09AF (B. LETTER YA) + 09BE (B. VOWEL SIGN AA)

 - vowel_a_zophola_aa can be encoded as:
 0985 (B. LETTER A) + 09CD (B. SIGN VIRAMA) + 09AF (B. LETTER YA) + 09BE
 (B. VOWEL SIGN AA)

 - vowel_e_zophola_aa can be encoded as:
 098F (B. LETTER E) + 09CD (B. SIGN VIRAMA) + 09AF (B. LETTER YA) + 09BE
 (B. VOWEL SIGN AA)
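
 The four sequences listed above can be written out as a minimal Python
 sketch. The code points are the ones given in the message; the variable
 names and the describe() helper are illustrative assumptions only.

```python
import unicodedata

# Code points as listed in the message above.
VIRAMA = "\u09CD"    # BENGALI SIGN VIRAMA
YA = "\u09AF"        # BENGALI LETTER YA
AA_SIGN = "\u09BE"   # BENGALI VOWEL SIGN AA
LETTER_A = "\u0985"  # BENGALI LETTER A
LETTER_E = "\u098F"  # BENGALI LETTER E

# Each longer form is built from the shorter one, mirroring the list above.
zophola = VIRAMA + YA
zophola_aa = zophola + AA_SIGN
vowel_a_zophola_aa = LETTER_A + zophola_aa
vowel_e_zophola_aa = LETTER_E + zophola_aa


def describe(text):
    """Return one 'U+XXXX NAME' string per character (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


for line in describe(vowel_a_zophola_aa):
    print(line)
```

 Nothing here is specific to a rendering engine; it only shows that each
 form is an ordinary sequence of already-encoded characters.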

> The problem with [this] is that the string would have
> to be specifically looked for. [...]

 Problem? Why a problem? The main job of a rendering engine is to look up
 the glyphs that correspond to strings of one or more characters. Why should
 *this* particular lookup be a problem?
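
 The kind of lookup described here can be sketched as a longest-match
 search over a glyph table. This is only a sketch under stated assumptions:
 the table, the glyph names, and the shape() function are all hypothetical,
 and a real OpenType engine works through font lookup tables rather than a
 Python dictionary.

```python
# Hypothetical glyph table: character sequences mapped to made-up glyph names.
GLYPHS = {
    "\u0985": "a",
    "\u098F": "e",
    "\u09AF": "ya",
    "\u09BE": "aa_sign",
    "\u09CD": "virama",
    "\u09CD\u09AF": "zophola",            # VIRAMA + YA
    "\u09CD\u09AF\u09BE": "zophola_aa",   # VIRAMA + YA + VOWEL SIGN AA
}
MAX_KEY = max(len(key) for key in GLYPHS)


def shape(text):
    """Map text to glyph names, always preferring the longest matching key."""
    glyphs = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for length in range(min(MAX_KEY, len(text) - i), 0, -1):
            chunk = text[i:i + length]
            if chunk in GLYPHS:
                glyphs.append(GLYPHS[chunk])
                i += length
                break
        else:
            # No entry covers this character at all.
            glyphs.append(".notdef")
            i += 1
    return glyphs


print(shape("\u0985\u09CD\u09AF\u09BE"))
```

 In this sketch the zophola ligature forms unconditionally, whatever
 precedes it, which is the behaviour Marco argues for.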

 Abdul says:-

 OK OK I don’t want to argue with you but I need official guidance.

 You must remember that this sequence: Vowel_A Virama Letter_Ya Vowelsign_AA
 is considered a vowel in its own right or at least a single syllable (i.e.
 it has to be recognized as such by Indic rendering). So suppose I want to
 place a Candrabindu on top of it. Do I encode it as Vowel_A Virama Letter_Ya
 Vowelsign_AA Candrabindu, as Vowel_A Candrabindu Virama Letter_Ya
 Vowelsign_AA, or as something else?
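
 The two candidate orderings can be compared in a short Python sketch. It
 assumes U+0981 (BENGALI SIGN CANDRABINDU) as the Candrabindu, and it uses
 NFC normalization from the standard unicodedata module as one possible
 test of whether software would treat the two spellings as the same text.
 The helper name same_after_nfc() is hypothetical.

```python
import unicodedata

A = "\u0985"            # BENGALI LETTER A
VIRAMA = "\u09CD"       # BENGALI SIGN VIRAMA
YA = "\u09AF"           # BENGALI LETTER YA
AA_SIGN = "\u09BE"      # BENGALI VOWEL SIGN AA
CANDRABINDU = "\u0981"  # BENGALI SIGN CANDRABINDU (assumed code point)

# Candrabindu typed last, after the whole syllable.
candidate_1 = A + VIRAMA + YA + AA_SIGN + CANDRABINDU
# Candrabindu typed directly after the base vowel.
candidate_2 = A + CANDRABINDU + VIRAMA + YA + AA_SIGN


def same_after_nfc(x, y):
    """True if the two strings are identical once normalized to NFC."""
    return unicodedata.normalize("NFC", x) == unicodedata.normalize("NFC", y)


print(same_after_nfc(candidate_1, candidate_2))
```

 Both candidates contain exactly the same characters, yet a plain string
 comparison keeps them apart, so a search or sort would see two different
 spellings unless one ordering is defined as the correct one.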

 You see? There will have to be sequences that are considered illegal where
 Indic scripts are concerned. Otherwise people will spell one word more than
 one way. A good example is Devanagari_vowel I. Some people using their
 current software have to type it before the consonant rather than after. If
 you said that Vowel_I should be rendered unconditionally we would be in a
 real mess with regard to alphabetic sorting.
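
 The sorting concern can be illustrated with a small Python sketch. The
 assumptions are stated loudly: the three sample words are chosen only for
 illustration, and plain code point order stands in for alphabetic sorting,
 which a real collation algorithm would refine.

```python
# Devanagari code points used in the sample words.
KA, GA, TA, BA = "\u0915", "\u0917", "\u0924", "\u092C"
MA, LA = "\u092E", "\u0932"
SIGN_AA, SIGN_I = "\u093E", "\u093F"

kamal = KA + MA + LA
gamla = GA + MA + LA + SIGN_AA

# Same word, two spellings: vowel sign I stored after the consonant
# (logical order) versus typed before it (visual order).
kitab_logical = KA + SIGN_I + TA + SIGN_AA + BA
kitab_visual = SIGN_I + KA + TA + SIGN_AA + BA

# Plain code point ordering, used here as a stand-in for alphabetic sorting.
sorted_logical = sorted([gamla, kitab_logical, kamal])
sorted_visual = sorted([gamla, kitab_visual, kamal])

print(sorted_logical.index(kitab_logical))
print(sorted_visual.index(kitab_visual))
```

 With the logical spelling the word sorts among the other words beginning
 with KA; with the visual spelling it sorts under the vowel sign instead,
 so the same word lands in two different places.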


