Re: Yerushala(y)im - or Biblical Hebrew

From: Peter Kirk
Date: Wed Jul 23 2003 - 05:51:40 EDT

    On 22/07/2003 20:49, John Hudson wrote:

    > Thinking about this whole Hebrew encoding/normalisation problem from
    > the rendering side -- i.e. in terms of smart font glyph substitution
    > and mark positioning -- it seems to me that *if* a character is to be
    > inserted between two vowels that visually follow a single consonant,
    > it would be preferable if this were not a control character. If the
    > character is something that is actually painted as a glyph, it is
    > incredibly easy to resolve the rendering problems in the font's
    > composition/decomposition <ccmp> feature by, for example, ignoring
    > marks while 'ligating' the consonant + extra character to the
    > consonant glyph alone. This would happen before the mark positioning
    > takes place, so would ensure that the presence of the extra character
    > does not break mark positioning in the way that actual control
    > characters seem to (and which cannot be included in <ccmp> lookups
    > because they are not painted).
    >
    > Of course, if this problem is thought about from the text search,
    > sort, etc. perspective, the presence of a non-control character --
    > even an invisible one -- introduces problems, since it changes the
    > encoding of the word. As noted previously, users are unlikely to know
    > that they must insert an invisible character into search strings. My
    > question is whether this is something that can be addressed in search
    > engines and other places affected, so that this character could be
    > filtered out during operations? Or, failing that, if we might here
    > have a use for a new zero-width character that will be ignored during
    > search and sort operations -- as control characters like ZWNJ and CGJ
    > are -- but which will be painted -- as control characters typically
    > are not?
    >
    > John Hudson
    > Tiro Typeworks
    > Vancouver, BC

    Couldn't the rendering engine simply treat CGJ as a non-control
    character with a blank, zero-width glyph? Then it could ligate it in
    the way you suggest. As far as I can tell there is no other known use
    for CGJ and therefore no need for a rendering engine to process it as a
    control character. This behaviour of CGJ can presumably be made
    independent of its behaviour in searching and sorting.
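
    To illustrate the search side of this, here is a minimal Python sketch
    of filtering such characters out before comparison, as the quoted
    message asks about. The helper names search_key and contains, and the
    choice of which code points to ignore, are assumptions made for this
    example only; they do not describe how any existing search engine or
    collation library behaves.

```python
# Illustrative sketch only: names and the ignored set are assumptions.
CGJ = "\u034F"   # COMBINING GRAPHEME JOINER
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER

# Code points this sketch treats as invisible to searching.
IGNORED = frozenset({CGJ, ZWNJ})


def search_key(text: str) -> str:
    """Return text with the ignored code points removed."""
    return "".join(ch for ch in text if ch not in IGNORED)


def contains(haystack: str, needle: str) -> bool:
    """Substring test that disregards the ignored code points on both sides."""
    return search_key(needle) in search_key(haystack)


# Lamed + patah + CGJ + hiriq + final mem, as in the end of Yerushala(y)im.
stored = "\u05DC\u05B7" + CGJ + "\u05B4\u05DD"
# What a user who does not know about the CGJ would type.
query = "\u05DC\u05B7\u05B4\u05DD"

plain_match = query in stored              # raw comparison
filtered_match = contains(stored, query)   # comparison after filtering
```

    With these strings the raw comparison fails, because the CGJ sits
    between the two vowels, while the filtered comparison succeeds. The
    sketch deliberately applies no normalisation, since the reordering of
    the vowels under normalisation is a separate part of the problem.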

    Peter Kirk

