RE: No Invisible Character - NBSP at the start of a word

From: Peter Constable (petercon@microsoft.com)
Date: Mon Dec 06 2004 - 11:41:04 CST


    > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On
    > Behalf Of Dean Snyder

    > >I would say that pointing
    > >one text with the vowels of another, without regard for discrepancies in
    > >character-count, constitutes an abuse of the Hebrew orthography, and
    > >shouldn't be considered "normal" usage that must be supported.
    >
    > Calling ketiv/qere spellings orthographic abuse, abnormal, and not worthy
    > of support in Unicode is based on reasoning backwards from the faulty
    > Unicode model for encoded Hebrew, rather than forwards from the Hebrew
    > script to an encoding model.

    I'd agree, except that I wouldn't give a blanket characterization of the
    Unicode encoding for Hebrew as being faulty.

    There is a natural tendency for people familiar with a particular
    language and its associated script to view encoding requirements as tied
    to that language. I really think that when we devise encodings (and, to
    some extent, rendering implementations -- I mention that since that's
    something I work on) we need to abstract the script away from a
    particular language. The reason for this is that the way the script is
    used to write a particular language at a particular point in time is a
    snapshot of one particular usage. Writing changes with time, and there
    is a tendency for scripts to be adopted for use by other languages.

    I also think we need to view encoding as a representation of text
    elements, whatever the linguistic interpretation (or non-interpretation)
    of those text elements. Thus, I agree with Dean:

    > From an encoding point of view, ketiv/qere is NOTHING MORE than arbitrary
    > sequences of Hebrew vowels and consonants, and just as Unicode supports
    > ANY sequence of Latin vowels and consonants it should have, from the very
    > beginning, supported ANY sequence of Hebrew vowels and consonants.

    except that where he says "it should have" I'd say that I've always
    assumed that it does.
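
    To make that assumption concrete at the storage level, here is a minimal
    Python sketch. The sample string is hypothetical, chosen only to put points
    and letters in an unusual order, with a no-break space carrying a vowel
    that has no consonant. It shows only that such a sequence can be stored and
    round-tripped; it says nothing about how a font or renderer displays it.

```python
import unicodedata

NBSP = "\u00A0"    # NO-BREAK SPACE, used here as a carrier for a lone vowel
HIRIQ = "\u05B4"   # HEBREW POINT HIRIQ
QAMATS = "\u05B8"  # HEBREW POINT QAMATS
YOD = "\u05D9"     # HEBREW LETTER YOD
RESH = "\u05E8"    # HEBREW LETTER RESH

# Hypothetical sequence: a vowel with no consonant, two consonants,
# then two vowel points on the same consonant.
sample = NBSP + HIRIQ + YOD + RESH + QAMATS + QAMATS


def describe(text):
    """Return (code point, character name, general category) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


# The sequence survives an encode/decode round trip unchanged.
round_tripped = sample.encode("utf-8").decode("utf-8")

for code_point, name, category in describe(round_tripped):
    print(code_point, category, name)
```

    The general categories (Zs for the space, Lo for letters, Mn for points)
    are what a rendering engine would have to work from; whether a given
    engine positions the marks sensibly is a separate question.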

    > The
    > problem lies not in the script, the problem lies in the inadequate
    > encoding model adopted for it - and it needs to be fixed. ALL of the
    > Hebrew script must be supported; anything less is simply unacceptable.

    At this point, I would ask that people move from voicing critiques and
    stating inadequacy to making concrete proposals that identify precisely
    what is inadequate and precisely how that can be remedied.

    Peter Constable



    This archive was generated by hypermail 2.1.5 : Mon Dec 06 2004 - 11:42:47 CST