Re: Proposal to add four characters for Kashmiri to the BMP of the UCS

From: Kenneth Whistler
Date: Mon Jul 07 2008 - 14:05:27 CDT

    First of all, I will state up front that I have no objection
    to the proposal as written -- it seems justified given the information
    about the recent Kashmiri orthography reform.

    > I suspect that most of the pre-composed isolate vowels were included for
    > backwards compatibility with a pre-existing standard(s) like ISCII -

    Yes, but not just for that reason.

    > IMO there is no good reason to add additional pre-composed characters
    > when a base character + combining mark will work fine, particularly when
    > these characters are for what seems to be pretty much a brand-new
    > orthography.

    I disagree in this case. Devanagari works differently (for its
    Unicode encoding) than Tibetan does.

    U+0972 DEVANAGARI LETTER CANDRA A was added as recently as Unicode 5.1
    (and not decomposed). We went through the same set of arguments then,
    and I don't see the value of hashing through it every time another
    example comes up.

    > Preserving consistency could be used the next time someone wants to add
    > more pre-composed Latin chars.

    No, because precomposed Latin characters have canonical decompositions.
    Devanagari (and most other Indic) independent vowels do not.
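
    The distinction can be checked against the Unicode Character Database.
    What follows is only a minimal sketch, assuming Python's standard
    unicodedata module; the helper name canonical_decomposition and the
    choice of sample characters are illustrative, not part of the proposal.
    It reports the canonical decomposition mapping of a character, if any.

```python
import unicodedata


def canonical_decomposition(ch: str) -> str:
    """Return the canonical decomposition mapping of ch, or "" if it has none.

    Hypothetical helper for illustration only.
    """
    mapping = unicodedata.decomposition(ch)
    # Compatibility mappings carry a tag such as "<compat>"; those are not
    # canonical decompositions, so treat them as "none" here.
    if mapping.startswith("<"):
        return ""
    return mapping


samples = [
    "\u00E9",  # LATIN SMALL LETTER E WITH ACUTE (precomposed Latin)
    "\u0906",  # DEVANAGARI LETTER AA (independent vowel)
    "\u0972",  # DEVANAGARI LETTER CANDRA A (added in Unicode 5.1)
]

for ch in samples:
    mapping = canonical_decomposition(ch) or "(none)"
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {mapping}")
```

    The precomposed Latin letter reports a mapping to a base letter plus a
    combining mark, while the two Devanagari letters report none, so
    normalization to NFD leaves them unchanged.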

    > I thought there was a policy not to add more pre-composed characters. Is
    > this not the case?

    It generally *is* the case. But what that means is that characters
    will not be encoded if, by precedent, characters of that type have
    *canonical* decompositions to already encoded pieces.

    It doesn't mean that there is an absolute proscription against
    encoding complex graphic entities as characters.

