Re: Errors in TUS Figure 15.2?

From: Antoine Leca
Date: Mon Aug 02 2004 - 03:25:19 CDT

    On Friday, July 30th, 2004 19:47, Peter Kirk wrote:
    >>> There appear to be two errors (not listed in the errata page)
    >>> in Figure 15.2 on page 391 of The
    >>> Unicode Standard 4.0, the online version at
    >>> The fourth column is supposed to indicate the desired rendering of
    >>> <C1, ZWJ, C2>. But in the text just before, ZWJ is specified as

    Otto answered:
    >> Read the paragraph immediately below that figure.
    > OK. I did. But I shouldn't have to do that as this figure is supposed
    > to be an example of what has been specified before.

    Then have a look at Unicode 3.0.1
    <URL:> and you will
    understand what happened: initially it was the way you expected; but
    then (I cannot spot exactly when, though it should be possible to find out),
    for backward compatibility, this very behaviour (requesting ligatures) was
    disabled for Arabic only. As a result, the table was updated, and is now
    just about useless. We really should provide examples from other scripts
    (Khmer perhaps; and Sinhala, which appears to behave exactly this way
    according to SLS 1134, the Sri Lankan standard).

    > And there is still a problem with the text before the figure.

    Which text?

    I was noticing a problem, but it is not the one you are pointing out.
    Page 390 has a section which describes the behaviour of ZWJ. This text is
    where it is written that ZWJ requests a ligature (a useful addition
    here would be to signal that Arabic on the one hand, and the scripts of
    India on the other, are exceptional in this respect). Then, if a ligature
    is not available ("otherwise"), it explains the function of ZWJ as
    requesting a cursively connected form.

        . Otherwise, if either of the characters could cursively connect
          but do not normally, ZWJ requests that each of the characters
          take a cursive-connection form where possible.

        In a sequence like <X, ZWJ, Y>, where a cursive form exists for X
        but not for Y, the presence of ZWJ requests a cursive form for X.

    Up to there, I have no problem. But then I would have expected the obvious
    reverse case, where a cursive form exists for Y but not for X (and the
    function of ZWJ would be to request the/a cursive form for Y). However,
    there is no such text... what is written is:

        Otherwise, where neither a ligature nor a cursive connection is
        available, the ZWJ has no effect.

    I believe this is a formal defect that ought to be corrected, particularly
    since Sinhala uses this feature quite a lot (for rakaransaya and yansaya),
    and also since it is described a few lines below, with the example from the
    Persian standard!
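
    To make the missing case concrete, here is a minimal sketch of the
    symmetric behaviour I would expect when no ligature is available. It is
    not text from the standard: the function name, the boolean parameters and
    the string labels are all hypothetical, chosen only for illustration.

```python
def forms_requested_by_zwj(x_has_cursive, y_has_cursive):
    """Forms requested for X and Y in <X, ZWJ, Y> when no ligature exists.

    Hypothetical model: each character is treated independently, so a
    cursive form is requested for whichever side has one.
    """
    x_form = "cursive" if x_has_cursive else "unconnected"
    y_form = "cursive" if y_has_cursive else "unconnected"
    return x_form, y_form


# Case covered by the quoted text: cursive form for X only.
print(forms_requested_by_zwj(True, False))
# Reverse case, which the text does not spell out: cursive form for Y only.
print(forms_requested_by_zwj(False, True))
# Neither side has a cursive form: ZWJ has no effect.
print(forms_requested_by_zwj(False, False))
```

    The second call is the one the current wording leaves unspecified.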

    The rest of the section reads:

        In other words, given three broad categories below, ZWJ requests
        that glyphs in the highest available category (for the given font)
        be used:

        1. Unconnected

        2. Cursively connected

        3. Ligated
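
    The "highest available category" rule above can be sketched as follows.
    This is only an illustration under assumptions: the enum and function
    names are hypothetical, the set of available categories stands in for
    whatever the font provides, and the Arabic exception mentioned earlier
    is deliberately not modelled.

```python
from enum import IntEnum


class Joining(IntEnum):
    """The three broad categories, numbered as in the quoted list."""
    UNCONNECTED = 1
    CURSIVE = 2
    LIGATED = 3


def zwj_rendering(available):
    """Return the highest category the (hypothetical) font offers.

    `available` is any iterable of Joining values; the unconnected
    form is assumed to be always available as a fallback.
    """
    return max(set(available) | {Joining.UNCONNECTED})


# Font with both a ligature and cursive forms: the ligature is requested.
print(zwj_rendering({Joining.CURSIVE, Joining.LIGATED}).name)
# Font with cursive forms only.
print(zwj_rendering({Joining.CURSIVE}).name)
# Font with neither: ZWJ has no effect.
print(zwj_rendering(set()).name)
```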

