From: Jony Rosenne (rosennej@qsm.co.il)
Date: Wed Jul 30 2003 - 15:29:12 EDT
Problem:
We have here one character sequence with two meanings and two alternate
renditions: the common rendition, in which both meanings look the same, and
a distinguished rendition, which uses two separate glyphs for the separate
meanings.
On paper, which is two-dimensional, it is a Vav with a Holam point somewhere
above it. Unicode decided that in the encoding, which is one-dimensional,
the marks follow the base character.
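
As a rough illustration of the point above (a sketch only, not part of any proposal), the snippet below builds the sequence in logical order, with the base letter first and the point second, and prints what the Unicode character database says about each code point. The helper name describe is hypothetical; the code points U+05D5 and U+05B9 and the standard-library unicodedata calls are the only things taken as given.

```python
import unicodedata

VAV = "\u05D5"    # HEBREW LETTER VAV (base character)
HOLAM = "\u05B9"  # HEBREW POINT HOLAM (combining mark)


def describe(text):
    """Return (code point, name, combining class) for each character.

    Hypothetical helper, used here only for illustration.
    """
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


# In the encoding the mark follows its base, wherever it sits on paper.
sequence = VAV + HOLAM

for code_point, name, combining_class in describe(sequence):
    print(code_point, name, combining_class)
```

The same two code points are all that is stored whether the writer meant a holam male or a consonantal Vav carrying a holam haser, which is why the distinguished rendition cannot be recovered from this sequence alone.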
Any solution should accommodate both kinds of users and both renditions.
Solution: Suggestions, please.
Jony
> -----Original Message-----
> From: unicode-bounce@unicode.org
> [mailto:unicode-bounce@unicode.org] On Behalf Of Ted Hopp
> Sent: Wednesday, July 30, 2003 6:43 PM
> To: unicode@unicode.org; Joan_Wardell@sil.org
> Subject: Re: Back to Hebrew -holem-waw vs waw-holem
>
>
> On Wednesday, July 30, 2003 11:57 AM, Joan_Wardell@sil.org wrote:
> > I agree 100% with your description of the characters that have not
> > been encoded in Unicode. There are certainly marks and consonants that
> > mean two completely different things, as you have so accurately
> > described. But there are two approaches to encoding. There is "Code
> > what you see" and "Code what is meant". In your analysis and in the
> > way SIL encoded the original SIL Ezra font, we went with "Code what is
> > meant". This means that we have two shevas (one pronounced and one
> > silent), a holemwaw character and a shureq character. Unicode, on the
> > other hand, is totally "Code what you see". It is attempting to make
> > no analysis of the marks on the page. If there is a mark, code it. If
> > it is identical to another mark, then it gets the same codepoint. (Of
> > course, there are exceptions, but this is the general rule.)
>
> One of the key points some of us are trying to make is that
> vav with kholam khaser is a different mark on the page than a
> kholam male. Different semantics AND different appearance,
> but no separate Unicode encoding. What more do we need?
>
> Besides, what's all this that I keep reading about Unicode
> encodes characters, not glyphs? From Chapter 1: "[T]he
> standard defines how characters are interpreted, not how
> glyphs are rendered." The "code what you see" approach, while
> probably the reality of Unicode, seems somewhat contrary to
> this statement of principle.
>
> > So with Unicode, there is no way to separate even vowels and
> > consonants, since a waw in a shureq, a holem-waw, and just a plain waw
> > will always be encoded the same. Some of us are trying to make this
> > approach usable by allowing at least a holem-waw to be distinguished
> > from waw holem, by placing the holem first.
> >
> > For the encoders, it is fairly straight-forward. For the people trying
> > to actually use the encoding, it's going to take a lot of context to
> > determine what you've got.
>
> Yes, indeed. Nothing like an encoding that can't be decoded. :)
>
> Ted
>
> Ted Hopp, Ph.D.
> ZigZag, Inc.
> ted@newSLATE.com
> +1-301-990-7453
>
> newSLATE is your personal learning workspace
> ...on the web at http://www.newSLATE.com/
This archive was generated by hypermail 2.1.5 : Wed Jul 30 2003 - 15:41:05 EDT