RE: SPAM: About combining classes

From: Jony Rosenne (rosennej@qsm.co.il)
Date: Fri Jun 27 2003 - 09:32:11 EDT


    > -----Original Message-----
    > From: unicode-bounce@unicode.org
    > [mailto:unicode-bounce@unicode.org] On Behalf Of Philippe Verdy
    > Sent: Friday, June 27, 2003 12:31 PM
    > To: unicode@unicode.org
    > Subject: SPAM: About combining classes
    >
    >
    > When I just look at the history of combining classes, they
    > did not exist in the first Unicode standard, and they still
    > don't exist in ISO 10646 either. This was a technology
    > developed by IBM and offered for free to the community to
    > allow a simplified management of encoded texts, and it was
    > long informative (as were the proposed normalization
    > forms) before it was recognized that it would be useful.
    >
    > However, if this added character property may break the
    > encoding of some languages (including future languages
    > that may be encoded), I think that this creates an
    > opportunity to standardize the use of a specific character
    > that will allow bypassing the constraints added by these
    > now standard combining classes when it is needed.
    >
    > The case of Biblical Hebrew is what will occur again in the
    > future, because combining classes have been defined to stay
    > for a long time, as they solve many problems with modern
    > languages. Of course the CGJ character works, but we'll have
    > more pressure in the future to use some bypassing encoding
    > features when this is really needed for any newly encoded text.
    >
    > Without this added character (CGJ for example), all future
    > encoded scripts may simply abandon the idea of assigning
    > non-zero combining classes, even though they would be useful
    > in many cases to detect the *most common* obvious equivalences
    > and simplify the unification of text with the same semantics
    > and graphical rendering.
    >
    > We *must not* go back on the encoding of Hebrew.
    > Traditional Hebrew is definitely a distinct language, in the
    > same way as Old Greek, or Old Hungarian, or the various
    > regional forms of languages written historically with many
    > variants of diacritics on Latin letters.

    I beg to disagree. It is the same language. The examples such as Greek are
    different. Our young pupils regularly read the original biblical texts.

    The problem is that in the Bible there are a number of "irregular"
    combinations, such as a small number of cases of Meteg on the other side,
    the Hiriq in Yerushala(y)im, etc. While I appreciate their importance for
    Biblical scholars, and of course a solution must be found, it is critical
    that any solution should not disrupt the regular use of Hebrew.
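
    To make the issue concrete, here is a minimal sketch of how canonical
    reordering affects the Yerushala(y)im case, and how a CGJ blocks it. It
    assumes Python's standard unicodedata module and a lamed carrying patah
    followed by hiriq; the variable names are illustrative only and are not
    taken from any proposal in this thread.

```python
import unicodedata

# Code points used in this illustration.
LAMED = "\u05DC"   # HEBREW LETTER LAMED, combining class 0
PATAH = "\u05B7"   # HEBREW POINT PATAH, combining class 17
HIRIQ = "\u05B4"   # HEBREW POINT HIRIQ, combining class 14
CGJ = "\u034F"     # COMBINING GRAPHEME JOINER, combining class 0


def combining_classes(text):
    """Return the canonical combining class of each code point."""
    return [unicodedata.combining(ch) for ch in text]


# Biblical order: patah first, then hiriq, both on the lamed.
typed = LAMED + PATAH + HIRIQ
normalized = unicodedata.normalize("NFC", typed)

# Normalization sorts adjacent marks by combining class, so the
# hiriq (14) moves in front of the patah (17).
print(combining_classes(typed), combining_classes(normalized))

# A CGJ has class 0, so marks are not reordered across it.
guarded = LAMED + PATAH + CGJ + HIRIQ
guarded_normalized = unicodedata.normalize("NFC", guarded)
print(guarded_normalized == guarded)
```

    Ordinary Hebrew text that never relies on mark order is unaffected
    either way; only the irregular sequences need the extra character.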

    In the past, my suggestion that some of these are not plain text issues and
    that they should be relegated to a higher level protocol was not accepted.
    The current discussion causes me to bring it up once again.

    I am under the impression that the existing scientific encodings of the
    Bible are encoded with the help of some kind of markup, and maybe this is
    how they should continue.

    Jony


