RE: No Invisible Character - NBSP at the start of a word

From: Peter Constable (petercon@microsoft.com)
Date: Mon Dec 06 2004 - 11:41:04 CST

Next message: Antoine Leca: "Re: Nicest UTF"

Previous message: Johannes Bergerhausen: "Arial Unicode MS"
Next in thread: Peter Kirk: "Re: No Invisible Character - NBSP at the start of a word"
Reply: Peter Kirk: "Re: No Invisible Character - NBSP at the start of a word"

> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]
> On Behalf Of Dean Snyder

> >I would say that pointing
> >one text with the vowels of another, without regard for discrepancies
> >in character-count, constitutes an abuse of the Hebrew orthography, and
> >shouldn't be considered "normal" usage that must be supported.
>
> Calling ketiv/qere spellings orthographic abuse, abnormal, and not
> worthy of support in Unicode is based on reasoning backwards from the faulty
> Unicode model for encoded Hebrew, rather than forwards from the Hebrew
> script to an encoding model.

I'd agree, except that I wouldn't give a blanket characterization of the
Unicode encoding for Hebrew as being faulty.

There is a natural tendency for people familiar with a particular
language and its associated script to view encoding requirements as tied
to that language. I really think that when we devise encodings (and, to
some extent, rendering implementations -- I mention that since that's
something I work on) we need to abstract the script away from a
particular language. The reason for this is that the way the script is
used to write a particular language at a particular point in time is a
snapshot of one particular usage. Writing changes with time, and there
is a tendency for scripts to be adopted for use by other languages.

I also think we need to view encoding as a representation of text
elements, whatever the linguistic interpretation (or non-interpretation)
of those text elements. Thus, I agree with Dean:

> From an encoding point of view, ketiv/qere is NOTHING MORE than
> arbitrary sequences of Hebrew vowels and consonants, and just as Unicode
> supports ANY sequence of Latin vowels and consonants it should have, from the
> very beginning, supported ANY sequence of Hebrew vowels and consonants.

except that where he says "it should have" I'd say that I've always
assumed that it does.

> The
> problem lies not in the script, the problem lies in the inadequate
> encoding model adopted for it - and it needs to be fixed. ALL of the
> Hebrew script must be supported; anything less is simply unacceptable.

At this point, I would ask that people move from voicing critiques and
stating inadequacy to making concrete proposals that identify precisely
what is inadequate and precisely how that can be remedied.

Peter Constable


This archive was generated by hypermail 2.1.5 : Mon Dec 06 2004 - 11:42:47 CST