Re: Major Defect in Combining Classes of Tibetan Vowels

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Jun 25 2003 - 18:29:59 EDT

    John Hudson wrote:

    > In Biblical Hebrew, it is possible for more than one vowel to be attached
    > to a single consonant. This means that it is very important to maintain the
    > ordering of vowels applied to a single consonant. The Unicode Standard
    > assigns an individual combining class to every vowel, meaning that NFC
    > normalisation may re-order vowels on a consonant.

    This is true.
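
    For concreteness, a small check (a sketch in Python, assuming its
    standard unicodedata module; the variable names are only
    illustrative) shows the combining classes of hiriq and patah and
    what NFC does to a lamed carrying patah followed by hiriq:

```python
import unicodedata

LAMED = "\u05DC"  # HEBREW LETTER LAMED (base consonant, class 0)
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ
PATAH = "\u05B7"  # HEBREW POINT PATAH

# The two possible spellings of the same consonant with both vowels.
traditional = LAMED + PATAH + HIRIQ
reordered = LAMED + HIRIQ + PATAH

# Each point has its own non-zero canonical combining class.
ccc_hiriq = unicodedata.combining(HIRIQ)
ccc_patah = unicodedata.combining(PATAH)

# Normalization sorts the marks by class, lowest first.
nfc_traditional = unicodedata.normalize("NFC", traditional)
nfc_reordered = unicodedata.normalize("NFC", reordered)

print(ccc_hiriq, ccc_patah)
print([f"U+{ord(c):04X}" for c in nfc_traditional])
print(nfc_traditional == nfc_reordered)
```

    Both spellings normalize to the same code point sequence, with
    hiriq (class 14) ahead of patah (class 17).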

    > This is not simply
    > 'non-traditional' but results in incorrect rendering and a different
    > vocalisation of the text.

    I don't think this is true.

    First, the intent of the (admittedly problematical) fixed position
    combining classes was that the position of the relevant marks,
    including the relevant Hebrew points, was fixed with respect to
    the consonant base letter, so that applying one mark would not
    affect the rendering of another. Unlike the
    generic above and below combining classes, the general inside-out
    positioning rule would not apply to sequences of fixed position
    marks.

    It may be more *difficult* for applications to do correct rendering,
    but there was never any intention in the standard that I know
    of that a sequence <hiriq, patah> would render differently
    than a sequence <patah, hiriq>. And never any intent that it
    would represent a "different vocalisation of the text".

    > The point is that hiriq before patah is *not*
    > canonically equivalent to patah before hiriq,

    This is true.

    > except in the erroneous
    > assumption of the Unicode Standard: the order of vowels makes words sound
    > different and mean different things.

    This is not. The Unicode Standard makes no assumptions or claims
    about what the phonological or meaning equivalence of <hiriq, patah>
    or <patah, hiriq> is for Biblical Hebrew.

    The fact that traditional Biblical Hebrew spelling prefers one
    order of representation and canonically ordered Unicode text
    specifies the opposite order may be a problem for implementations,
    but that problem does not extend to the claims that John is
    making here.

    >
    > In order to correctly encode and render the Biblical Hebrew text, it is
    > necessary to either a) never use normalisation routines that re-order marks
    > (which is beyond the control of document authors), or b) re-classify the
    > existing Hebrew marks so that all vowels are in a single class and will not
    > be re-ordered during normalisation, or c) encode new marks for Biblical
    > Hebrew with all vowels in a single class.

    I don't think these conclusions follow from the current
    situation.

    Such changes are certainly not necessary in order to *render*
    Biblical Hebrew text correctly, nor to accurately represent
    the content of Biblical Hebrew text.

    The current situation is not optimal for implementations, nor
    does canonically ordered text follow traditional preferences
    for spelling order -- that we can agree on. But I think the
    claims of inadequacy for the representation or rendering
    of Biblical Hebrew text are overblown.
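
    For implementers, the reordering at issue is just the canonical
    ordering step: within each run of marks with non-zero combining
    class, a stable sort by class. The sketch below assumes Python's
    unicodedata module; canonical_order and single_vowel_class are
    hypothetical names, and the shared class value 10 is an arbitrary
    stand-in for option (b) above, not anything in the standard.

```python
import unicodedata

def canonical_order(text, ccc=unicodedata.combining):
    """Stable-sort each run of non-starter marks by combining class."""
    out, run = [], []
    for ch in text:
        if ccc(ch) == 0:
            # A starter ends the current run of marks.
            out.extend(sorted(run, key=ccc))  # sorted() is stable
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=ccc))
    return "".join(out)

def single_vowel_class(ch):
    # Hypothetical table: the Hebrew points U+05B0..U+05BB all share
    # one class, so they can never be reordered relative to each other.
    if "\u05B0" <= ch <= "\u05BB":
        return 10
    return unicodedata.combining(ch)

LAMED, HIRIQ, PATAH = "\u05DC", "\u05B4", "\u05B7"
traditional = LAMED + PATAH + HIRIQ

# With the real classes, hiriq moves ahead of patah.
with_real_classes = canonical_order(traditional)

# With the hypothetical shared class, the input order is kept.
with_shared_class = canonical_order(traditional, ccc=single_vowel_class)

print(with_real_classes == LAMED + HIRIQ + PATAH)
print(with_shared_class == traditional)
```

    Nothing in this step touches rendering; it only fixes the stored
    order of the marks.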

    --Ken
