Re: Yerushala(y)im - or Biblical Hebrew

From: Peter Kirk (peter.r.kirk@ntlworld.com)
Date: Wed Jul 23 2003 - 08:13:54 EDT

Next message: Alan Wood: "RE: U+23D0 VERTICAL LINE EXTENSION"

Previous message: John Cowan: "Re: U+23D0 VERTICAL LINE EXTENSION"
Maybe in reply to: Philippe Verdy: "Re: Yerushala(y)im - or Biblical Hebrew"
Next in thread: Kenneth Whistler: "Re: Yerushala(y)im - or Biblical Hebrew"

On 23/07/2003 03:20, Paul Nelson (TYPOGRAPHY) wrote:

>Please look at the definition of CGJ and other such characters.
>Understand the differences between CGJ and ZWJ/ZWNJ.
>
>This discussion is very disturbing to me because after reading through
>the L2 document register it is unclear what the difference is between
>CGJ and ZWJ use.
>
>The fact that you desire a control character to not be treated as such
>greatly concerns me. This really feels like people are trying to figure
>out any way to twist existing constructs to avoid fixing the
>normalization weights. I am alarmed by the implications of putting
>control characters in place to somehow subvert the normalization.
>
>In an ideal world we would simply correct these values. However, it has
>been strongly communicated by the UTC that this cannot be done without
>jeopardizing stability agreements with IETF. Peter Constable has posted a
>document in the register on this topic that suggests a duplication of
>characters as a solution.
>
>Can we please have this topic put on the agenda for the next meeting of
>the UTC?
>
>Regards,
>
>Paul
>
I have been doing a little research into the defined properties of CGJ.
According to
http://www.unicode.org/book/preview/ch03.pdf it is defined in Unicode
4.0 as a "Default Ignorable". I am not surprised that some people
are confused, because
http://www.unicode.org/Public/4.0-Update/UCD-4.0.0.html#Default_Ignorable_Code_Point
tells me "For more information, see UAX #29: Text Boundaries
<http://www.unicode.org/reports/tr29/>.", but the string "ignorable" is
not found in UAX #29. A Google search, however, turned up
http://www.unicode.org/review/pr-5.html, described as "/text excerpted
from the Unicode Standard/" and numbered as section 5.22, so I suppose this
is from the unpublished chapter 5 of Unicode 4.0. According to this,
"Default ignorable code points are those that should be ignored by
default in rendering (unless explicitly supported)... An implementation
should ignore default ignorable characters in rendering whenever it does
/not/ support the characters." So my suggestion that a renderer should
simply ignore CGJ is far from twisting the requirements of Unicode; it
is in fact a requirement of Unicode 4.0, though one that I am hardly
surprised some people have missed.
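
As a rough illustration of what "ignore by default unless explicitly
supported" could mean in practice, here is a minimal Python sketch. It is
not taken from any real renderer: the set below is a hand-picked subset
of the default ignorable code points (CGJ, ZWNJ, ZWJ), and the function
name and the "supported" parameter are my own assumptions.

```python
# Hand-picked subset of Default_Ignorable_Code_Point, for illustration only.
# A real implementation would take the full list from the Unicode Character Database.
DEFAULT_IGNORABLE = {
    0x034F,  # COMBINING GRAPHEME JOINER (CGJ)
    0x200C,  # ZERO WIDTH NON-JOINER (ZWNJ)
    0x200D,  # ZERO WIDTH JOINER (ZWJ)
}


def visible_characters(text, supported=frozenset()):
    """Return the characters a renderer would hand on to glyph selection.

    Default ignorable characters are dropped unless their code point is
    listed in `supported`, i.e. unless the renderer defines specific
    behaviour for them.
    """
    return [
        ch
        for ch in text
        if not (ord(ch) in DEFAULT_IGNORABLE and ord(ch) not in supported)
    ]


# Lamed + patah + CGJ + hiriq, as in the Yerushala(y)im discussion.
sample = "\u05dc\u05b7\u034f\u05b4"
without_cgj = visible_characters(sample)
with_cgj = visible_characters(sample, supported={0x034F})
```

With no special support declared, the CGJ simply disappears from what
is rendered; a renderer that does define behaviour for CGJ would list it
in "supported" and receive it unchanged.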

How a particular renderer goes about ignoring a character internally
is a matter for that implementation. John Hudson and I
have suggested a mechanism for doing this with Uniscribe by treating the
character internally as a normal character with a blank glyph and always
ligating it with the preceding character. There may be other mechanisms
which are cleaner. But in any case some such mechanism seems to be
required, not just to fix this Hebrew problem but for conformance with
Unicode as a whole, so that CGJ is ignored by
the renderer unless some specific behaviour is defined. In the case of
rendering Hebrew, there seems to be no pressing need to define specific
behaviour as the default is at least close to what is required.
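
The blank-glyph idea can be modelled in a few lines of Python. This is
only a toy model of the logic, not Uniscribe or OpenType code: the glyph
names, the "blank" glyph and both function names are hypothetical. CGJ is
first mapped to a blank glyph like any normal character, and a second
pass then "ligates" that blank glyph with whatever precedes it, the
ligature being drawn exactly like the preceding glyph alone.

```python
CGJ = "\u034f"  # COMBINING GRAPHEME JOINER


def to_glyphs(text):
    """Map each character to a made-up glyph name; CGJ gets a blank glyph."""
    return ["blank" if ch == CGJ else f"uni{ord(ch):04X}" for ch in text]


def ligate_blank(glyphs):
    """Merge every blank glyph into the glyph that precedes it.

    The ligature of (previous glyph + blank) is taken to look identical
    to the previous glyph, so the blank contributes nothing to the output.
    A blank with no preceding glyph is left in place as a zero-width glyph.
    """
    out = []
    for glyph in glyphs:
        if glyph == "blank" and out:
            continue  # absorbed into the preceding glyph
        out.append(glyph)
    return out


# Lamed + patah + CGJ + hiriq
shaped = ligate_blank(to_glyphs("\u05dc\u05b7\u034f\u05b4"))
```

The result is the same glyph sequence as for the text without CGJ, which
is the sense in which the renderer "ignores" the character.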

-- 
Peter Kirk
peter.r.kirk@ntlworld.com
http://web.onetel.net.uk/~peterkirk/


This archive was generated by hypermail 2.1.5 : Wed Jul 23 2003 - 09:04:53 EDT