Re: [hebrew] Re: Holam background document

From: Philippe Verdy
Date: Tue Aug 03 2004 - 09:08:24 CDT


    From: "John Hudson"
    > Philippe Verdy wrote:
    > > All the "glyph string" processing above is out of scope of Unicode....
    > Yes, a renderer could be designed to work in this way, and fonts could be
    > designed to work with such a renderer. The issue is not whether rendering
    > systems can be made that would support the holam male / vav haluma
    > distinction in this way, but whether the distinction can be encoded in
    > such a way that it will work reliably in multiple rendering systems using
    > different glyph processing models. I'm much more concerned about existing
    > rendering systems than I am about imaginary ones.

    Uniscribe is not an "imaginary" rendering system. It does indeed use
    strings of glyph ids, but when you feed it a string of characters, that
    string is split into *multiple* strings of glyph ids, each one with its own
    context of rendering flags.
    This approach effectively creates ATTRIBUTED TEXT from the original
    Unicode plain text. The glyph strings in Uniscribe are of absolutely NO USE
    without their context of rendering attributes.
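
    To make the "one character string in, several attributed runs out" idea
    concrete, here is a toy itemizer. It is only a sketch and is NOT
    Uniscribe's API: the Run type, the itemize() function and the trick of
    guessing the script from the first word of the Unicode character name are
    all assumptions made for illustration.

```python
import unicodedata
from dataclasses import dataclass

# Hypothetical attributed run: a slice of the text plus rendering context.
@dataclass
class Run:
    text: str
    script: str
    rtl: bool

# Assumption: only these two scripts are treated as right-to-left here.
RTL_SCRIPTS = {"HEBREW", "ARABIC"}


def coarse_script(ch):
    """Guess a script from the first word of the character name (a crude heuristic)."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def itemize(text):
    """Split plain text into runs; marks and format controls stay with the previous run."""
    runs = []
    for ch in text:
        cat = unicodedata.category(ch)
        # Combining marks (M*) and format controls such as ZWJ (Cf) inherit
        # the attributes of the run they follow.
        inherits = cat.startswith("M") or cat == "Cf"
        script = coarse_script(ch)
        if runs and (inherits or runs[-1].script == script):
            runs[-1].text += ch
        else:
            runs.append(Run(ch, script, script in RTL_SCRIPTS))
    return runs


# Latin letters followed by vav + holam yield two runs with different attributes.
runs = itemize("ab\u05d5\u05b9")
```

    A real renderer would attach far more context to each run (font, language,
    embedding level, shaping features), but the point is the same: the glyph
    ids produced later only make sense together with those attributes.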

    You think this is acrobatic? Yes, it is! But such acrobatics are needed in
    any string renderer that claims to support Arabic ligatures and contextual
    forms, Brahmic letter reordering, or decomposition into sub-glyphs that
    will be reordered differently...

    As I say, all this is part of the job of the renderer, which is the only
    place where glyph ids may be introduced and used, in collaboration with
    fonts and other external data tables, or even under the control of a
    user-specified stylesheet or linguistic context.

    If you already need such a renderer to write Arabic (think about mirrored
    characters), or Tamil, or even Han (think about ruby or interlinear
    annotations, and vertical/horizontal presentation...) simply to render a
    plain-text document correctly, then you already have the tools needed to
    support distinctions such as the one between <C1,C2> and <C1,ZWJ,C2>,
    which is easy to encode in the plain-text Unicode document.

    The fact that some less advanced renderers will not be able to render the
    distinction SHOULD NOT limit the possibility of using ZWJ/ZWNJ to create
    additional distinctions in the plain text. In the case of vav-haluma and
    holam-vav, there really is a semantic distinction, which is an excellent
    reason why this distinction should be encodable in the plain text, even if
    some renderers will not be able to render the two forms differently....
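
    The <C1,C2> versus <C1,ZWJ,C2> distinction, and what a simpler renderer
    would do with it, can be sketched as follows. Taking C1 = VAV and
    C2 = HOLAM is an assumption for illustration only; which sequence would
    stand for which reading is not decided here, and naive_render_input() is a
    hypothetical stand-in for a renderer that ignores ZWJ.

```python
import unicodedata

VAV = "\u05d5"    # HEBREW LETTER VAV
HOLAM = "\u05b9"  # HEBREW POINT HOLAM
ZWJ = "\u200d"    # ZERO WIDTH JOINER

plain = VAV + HOLAM          # <C1, C2>
marked = VAV + ZWJ + HOLAM   # <C1, ZWJ, C2>


def survives_normalization(s):
    """True if the ZWJ is still present after each standard normalization form."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return all(ZWJ in unicodedata.normalize(form, s) for form in forms)


def naive_render_input(s):
    """Hypothetical fallback: a renderer that ignores ZWJ shapes the text without it."""
    return s.replace(ZWJ, "")


# The two sequences are different plain text...
distinct_in_plain_text = plain != marked
# ...the distinction is not lost by normalization...
kept_by_normalization = survives_normalization(marked)
# ...and a renderer that ignores ZWJ simply shows both the same way.
same_on_simple_renderer = naive_render_input(marked) == naive_render_input(plain)
```

    So the encoded text keeps the distinction even where the display does not,
    and a more capable renderer remains free to pick different glyphs for the
    two sequences.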
