RE: Proposal to add four characters for Kashmiri to the BMP of the UCS

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Jul 06 2008 - 03:47:48 CDT

    Christopher Fynn wrote:
    > N3480 Proposal to add four characters for Kashmiri to the BMP
    > of the UCS <http://std.dkuug.dk/jtc1/sc2/wg2/docs/n3480.pdf>
    >
    > I'm not sure why you need the proposed precomposed
    > "DEVANAGARI LETTER UE" and "DEVANAGARI LETTER UUE"
    >
    > Since these two could be easily represented by combining
    > U+0905 DEVANAGARI LETTER A with the proposed
    > "DEVANAGARI VOWEL SIGN UE" and "DEVANAGARI VOWEL SIGN UUE"

    You have probably not read the justifications given in the proposal. UE is a
    single short vowel, made distinct from short vowel U; UUE is the associated
    long vowel.

    They are certainly not letter A + one of the two vowel signs. The purpose is
    to have them treated exactly like the other Devanagari independent vowels,
    which are also coded without decomposing them into a base vowel A plus
    another vowel sign. The independent letter A does not behave in Devanagari
    like the consonant letters: its "implicit" vowel cannot be overridden by a
    vowel sign to create another independent letter; instead, each vowel has its
    own base letter. This model is preserved in the proposal.
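
    As a minimal sketch of that model, using only characters that are already
    encoded: Python's standard unicodedata module can show that the existing
    independent letter U has no decomposition mapping, and that the sequence
    letter A + vowel sign U does not normalize to it. The helper names are
    mine, for illustration only; nothing here comes from the proposal itself.

```python
import unicodedata

LETTER_A = "\u0905"      # DEVANAGARI LETTER A
LETTER_U = "\u0909"      # DEVANAGARI LETTER U (independent vowel)
VOWEL_SIGN_U = "\u0941"  # DEVANAGARI VOWEL SIGN U (dependent vowel)


def has_decomposition(ch: str) -> bool:
    """True if the character database lists any decomposition mapping."""
    return unicodedata.decomposition(ch) != ""


def nfc(text: str) -> str:
    """Apply Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


# The sequence a decomposed encoding model would rely on.
sequence = LETTER_A + VOWEL_SIGN_U

print(has_decomposition(LETTER_U))   # is the independent U decomposable?
print(nfc(sequence) == LETTER_U)     # does A + sign U compose into U?
print([unicodedata.name(c) for c in nfc(sequence)])
```

    Both checks print False: the independent vowel is atomic, and A + vowel
    sign stays a two-character sequence that is not equivalent to it.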

    In fact, if you look at it more closely, the proposed UE (ü, a variant of u)
    and UUE (û or long ü) in Devanagari address the fact that these vowels have
    so far been encoded by borrowing the Gurmukhi vowel signs for U and UU,
    which also have their own independent vowel forms (and there too, the
    independent vowels are not encoded as the independent Gurmukhi letter A plus
    a vowel sign).

    Why is it proposed for encoding? Most probably because transcription
    between Indic scripts causes problems, and such borrowing is a problem for
    the transcription of the Gurmukhi vowels: if you consider the romanization
    of the Gurmukhi vowel signs, they would translate as U and UU instead of the
    expected UE and UUE, so you would lose the distinction made in Kashmiri (and
    preserved when it is written in Devanagari) between U and UE, or between UU
    and UUE.
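
    The loss can be sketched with a toy romanizer. The table below is
    hypothetical and deliberately tiny (one consonant, four vowel signs); it
    only assumes that a per-character romanization maps the Gurmukhi signs for
    U and UU to "u" and "uu", as it does for the Devanagari ones.

```python
KA = "\u0915"            # DEVANAGARI LETTER KA
DEVA_SIGN_U = "\u0941"   # DEVANAGARI VOWEL SIGN U
DEVA_SIGN_UU = "\u0942"  # DEVANAGARI VOWEL SIGN UU
GURU_SIGN_U = "\u0A41"   # GURMUKHI VOWEL SIGN U (borrowed to write UE)
GURU_SIGN_UU = "\u0A42"  # GURMUKHI VOWEL SIGN UU (borrowed to write UUE)

# Hypothetical romanization table, for illustration only.
ROMAN = {
    KA: "k",
    DEVA_SIGN_U: "u",
    DEVA_SIGN_UU: "uu",
    GURU_SIGN_U: "u",
    GURU_SIGN_UU: "uu",
}


def romanize(text: str) -> str:
    """Romanize character by character; unknown characters pass through."""
    return "".join(ROMAN.get(ch, ch) for ch in text)


plain_u = KA + DEVA_SIGN_U      # KA with the ordinary vowel U
borrowed_ue = KA + GURU_SIGN_U  # KA with the borrowed sign standing in for UE

print(plain_u == borrowed_ue)                      # the encoded text differs
print(romanize(plain_u) == romanize(borrowed_ue))  # the romanization does not
```

    The two strings are distinct as encoded text, yet both romanize to "ku",
    so the U/UE contrast disappears unless the romanizer is given extra
    knowledge about which language the text is in.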

    Why do you want to break the existing model in Devanagari by trying to
    decompose the base letters for independent vowels into another independent
    vowel A plus a vowel sign modifier? This has not been done in Devanagari for
    any of the other independent vowels.

    It really looks like adding a distinction between U and UE, or between UU
    and UUE, is essential for Kashmiri even if it is not needed for Hindi or
    Sanskrit. So Kashmiri effectively needs the two additional vowel signs and,
    for consistency within the Devanagari script, also the associated
    independent vowel letters.

    Your suggestion would be valid if all the other Devanagari independent
    vowels were treated as if they were in fact composed of a base "consonant"
    letter A (the unpronounced/missing consonant plus the implicit vowel A) plus
    an optional vowel sign. This was not done in Devanagari: the independent
    vowels other than A are not decomposed. There's no reason to decompose them
    in the case of the Kashmiri variants.

    That's the way I understand it. The proposal preserves this consistency.
    In fact I would not like to see the independent letters UE and UUE
    decomposed the way you propose, as letter A + vowel sign: you would lose the
    fact that these independent letters are in fact variants of the independent
    letters U and UU.

    It might seem better to use the existing letters U and UU with a visarga to
    denote these variants, but I'm quite sure that there exist cases where
    visargas are used in Sanskrit (or in other languages written with
    Devanagari) not to create variants of the vowel, but instead variants of the
    base consonant of the akshara. The kind of modification is also not a
    nasalisation (so an anusvara can't be used to note these phonetic vowel
    variants; in fact the Kashmiri vowels can be used with or without
    nasalisation, meaning that the anusvara must remain usable separately with
    them).

    You could also have proposed not to encode the long vowels, given that they
    "look" exactly like pairs of short vowels: it would have been enough to add
    another UE vowel sign after the first UE vowel sign or after the independent
    letter UE. But here too this would contradict the encoding model used for
    the rest of the Devanagari script (and for other Indic scripts as well).

    For this reason, I don't see any defect in the proposal, and I also think
    that, under the given justifications, FOUR characters need to be encoded,
    and not just two or three. It is also interesting to read the introduction
    to the Devanagari script in TUS (from main version 2.0 up to the current
    version 5.0 of the book).



    This archive was generated by hypermail 2.1.5 : Sun Jul 06 2008 - 09:14:18 CDT