Re: Unicode 5.0 decompositions of Balinese vowel signs with tedung

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Apr 14 2006 - 10:34:09 CST

    You have excluded this one in your list:

    * U+1B0E = <U+1B0D ; U+1B35>
      BALINESE LETTER LA LENGA TEDUNG (vocalic ll) =
      BALINESE LETTER LA LENGA (vocalic l) +
      BALINESE VOWEL SIGN TEDUNG (aa)

    My opinion is that this is part of the set, even if the tedung takes a contextual ligatured form.
    One could still request the non-ligatured form by explicitly coding <U+1B0D,ZWNJ,U+1B35>. It would still be read as a LA LENGA with TEDUNG, even if the TEDUNG is not ligatured.
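
    As a rough illustration of the ZWNJ point, here is a minimal sketch. It assumes Python's standard unicodedata module, built on Unicode data that includes the decomposition U+1B0E = <U+1B0D, U+1B35>, and it takes the short letter U+1B0D as the base. The helper name nfc_codepoints is only illustrative. Normalization says nothing about whether a font ligates the tedung; the sketch only shows that an intervening ZWNJ keeps the sequence from being recomposed.

```python
import unicodedata

LA_LENGA = "\u1B0D"  # BALINESE LETTER LA LENGA
TEDUNG = "\u1B35"    # BALINESE VOWEL SIGN TEDUNG
ZWNJ = "\u200C"      # ZERO WIDTH NON-JOINER


def nfc_codepoints(text):
    """Return the code points of the NFC form of text as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


# Assumption: the running Python carries the canonical decomposition for U+1B0E.
plain = nfc_codepoints(LA_LENGA + TEDUNG)

# ZWNJ is a starter, so it sits between the letter and the tedung
# and the pair cannot be composed across it.
separated = nfc_codepoints(LA_LENGA + ZWNJ + TEDUNG)

print(plain)
print(separated)
```

    With such data, the plain sequence composes to the single letter, while the ZWNJ sequence stays as three code points and so remains distinguishable in stored text.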

    The charts show the ligatured form of this tedung, which suggests that it is the preferred form, but that does not change the fact that it is a ligature, no different from a LA LENGA and a normal TEDUNG joined on the right.

    And the Balinese name of U+1B0E is clear: it gives the interpretation for native Balinese readers, who may be confused by the fact that, without this canonical decomposition, the "LETTER LA LENGA TEDUNG" (vocalic long l) will be considered different from "LETTER LA LENGA" (vocalic l) followed by a "VOWEL SIGN TEDUNG" (long vowel mark). Are there reasons to keep these two sequences distinct?
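
    One way to check whether the two spellings are treated as canonically equivalent is sketched below. It assumes Python's standard unicodedata module, built on Unicode data that contains the decompositions discussed in this thread. The table PAIRS and the helper is_canonical_pair are illustrative names. The table holds the ten entries quoted below plus U+1B0E.

```python
import unicodedata

TEDUNG = 0x1B35  # BALINESE VOWEL SIGN TEDUNG

# Composite code point -> short form that takes the tedung.
# Ten entries from the quoted UnicodeData.txt lines, plus U+1B0E.
PAIRS = {
    0x1B06: 0x1B05,  # LETTER AKARA TEDUNG
    0x1B08: 0x1B07,  # LETTER IKARA TEDUNG
    0x1B0A: 0x1B09,  # LETTER UKARA TEDUNG
    0x1B0C: 0x1B0B,  # LETTER RA REPA TEDUNG
    0x1B0E: 0x1B0D,  # LETTER LA LENGA TEDUNG
    0x1B12: 0x1B11,  # LETTER OKARA TEDUNG
    0x1B3B: 0x1B3A,  # VOWEL SIGN RA REPA TEDUNG
    0x1B3D: 0x1B3C,  # VOWEL SIGN LA LENGA TEDUNG
    0x1B40: 0x1B3E,  # VOWEL SIGN TALING TEDUNG
    0x1B41: 0x1B3F,  # VOWEL SIGN TALING REPA TEDUNG
    0x1B43: 0x1B42,  # VOWEL SIGN PEPET TEDUNG
}


def is_canonical_pair(composite, base):
    """True if composite and <base, TEDUNG> normalize to each other."""
    precomposed = chr(composite)
    decomposed = chr(base) + chr(TEDUNG)
    return (
        unicodedata.normalize("NFD", precomposed) == decomposed
        and unicodedata.normalize("NFC", decomposed) == precomposed
    )


results = {f"U+{cp:04X}": is_canonical_pair(cp, base) for cp, base in PAIRS.items()}
for name, ok in results.items():
    print(name, "equivalent" if ok else "distinct")
```

    If any entry printed "distinct", the two spellings would compare unequal after normalization, which is exactly the confusable situation described above.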

    ----- Original Message -----
    From: "Peter Constable" <petercon@microsoft.com>
    To: <unicode@unicode.org>; <unicore@unicode.org>
    Sent: Friday, April 14, 2006 6:07 AM
    Subject: RE: Unicode 5.0 decompositions of Balinese vowel signs with tedung

    > Philippe has found a bug: the minutes of Mtg 103 make clear (103-C10) that the properties for Balinese characters were to be as specified in L2/05-090, and those have canonical decompositions for these multi-part vowels.
    >
    > I've checked the UnicodeData.txt properties for Balinese, and the decomposition mappings are the only ones with errors. Here are the corrected entries for the affected characters:
    >
    > 1B06;BALINESE LETTER AKARA TEDUNG;Lo;0;L;1B05 1B35;;;;N;;aa;;;
    > 1B08;BALINESE LETTER IKARA TEDUNG;Lo;0;L;1B07 1B35;;;;N;;ii;;;
    > 1B0A;BALINESE LETTER UKARA TEDUNG;Lo;0;L;1B09 1B35;;;;N;;uu;;;
    > 1B0C;BALINESE LETTER RA REPA TEDUNG;Lo;0;L;1B0B 1B35;;;;N;;vocalic rr;;;
    > 1B12;BALINESE LETTER OKARA TEDUNG;Lo;0;L;1B11 1B35;;;;N;;au;;;
    > 1B3B;BALINESE VOWEL SIGN RA REPA TEDUNG;Mc;0;L;1B3A 1B35;;;;N;;vocalic rr;;;
    > 1B3D;BALINESE VOWEL SIGN LA LENGA TEDUNG;Mc;0;L;1B3C 1B35;;;;N;;vocalic ll;;;
    > 1B40;BALINESE VOWEL SIGN TALING TEDUNG;Mc;0;L;1B3E 1B35;;;;N;;o;;;
    > 1B41;BALINESE VOWEL SIGN TALING REPA TEDUNG;Mc;0;L;1B3F 1B35;;;;N;;au;;;
    > 1B43;BALINESE VOWEL SIGN PEPET TEDUNG;Mc;0;L;1B42 1B35;;;;N;;;;;
    >
    >
    >
    > Peter Constable
    >
    >
    >
    >> -----Original Message-----
    >> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On
    >> Behalf Of Philippe Verdy
    >> Sent: Thursday, April 13, 2006 12:15 AM
    >> To: unicode@unicode.org
    >> Subject: Unicode 5.0 decompositions of Balinese vowel signs with tedung
    >>
    >> I note that the Unicode 5.0 BETA charts (d1) do not indicate any canonical
    >> decomposition for composite vowel signs, like they exist in Devanagari,
    >> especially for those that have a right-hand part. Won't that cause
    >> difficulties in implementations?
    >>
    >> Shouldn't they be given canonical equivalents (also reflected in
    >> their Balinese names)? This would also avoid two possible confusable
    >> encodings (for IDN or other similar apps), given that they will be
    >> rendered the same, and may be understood identically by people, including
    >> when composing texts on keyboards where these complex signs may be entered
    >> in a decomposed way, instead of a single keystroke for the composite; this
    >> would probably simplify the design of Balinese keyboards for those long
    >> vowels, given that this would avoid requiring more separate positions only
    >> for them, and the fact they could be entered equivalently in a decomposed
    >> way using only "short" vowels. An advanced keyboard or an editor may
    >> recompose them on the fly using simply the canonical decompositions.
    >>
    >> This concerns the following five vowel signs:
    >> * U+1B3B = <U+1B3A ; U+1B35>
    >> BALINESE VOWEL SIGN RA REPA TEDUNG (vocalic rr) =
    >> BALINESE VOWEL SIGN RA REPA (vocalic r) +
    >> BALINESE VOWEL SIGN TEDUNG (aa)
    >> * U+1B3D = <U+1B3C ; U+1B35>
    >> BALINESE VOWEL SIGN LA LENGA TEDUNG (vocalic ll) =
    >> BALINESE VOWEL SIGN LA LENGA (vocalic l) +
    >> BALINESE VOWEL SIGN TEDUNG (aa)
    >> * U+1B40 = <U+1B3E ; U+1B35>
    >> BALINESE VOWEL SIGN TALING TEDUNG (o) =
    >> BALINESE VOWEL SIGN TALING (e) +
    >> BALINESE VOWEL SIGN TEDUNG (aa)
    >> * U+1B41 = <U+1B3F ; U+1B35>
    >> BALINESE VOWEL SIGN TALING REPA TEDUNG (au) =
    >> BALINESE VOWEL SIGN TALING REPA (ai) +
    >> BALINESE VOWEL SIGN TEDUNG (aa)
    >> * U+1B43 = <U+1B42 ; U+1B35>
    >> BALINESE VOWEL SIGN PEPET TEDUNG (oe) =
    >> BALINESE VOWEL SIGN PEPET (ae) +
    >> BALINESE VOWEL SIGN TEDUNG (aa)
    >>
    >> Philippe.
    >>
    >>
    >>
    >
    >
    >
    >



    This archive was generated by hypermail 2.1.5 : Fri Apr 14 2006 - 10:35:26 CST