Re: Biblical Hebrew (U+034F Combining Grapheme Joiner works)

From: John Hudson (tiro@tiro.com)
Date: Sat Jun 28 2003 - 02:47:57 EDT

    At 07:10 PM 6/27/2003, Kenneth Whistler wrote:

    >Why? The point is that:
    >
    > <patah, CGJ, hiriq>
    >
    >is one thing, and
    >
    > <hiriq, CGJ, patah>
    >
    >is another. You *want* those sequences to be distinct, right? Even
    >if the text has been normalized, right? That was the whole
    >problem with:
    >
    > <patah, hiriq>
    > <hiriq, patah>
    >
    >which are canonically equivalent, since they both normalize to:
    >
    > <hiriq, patah>
    >
    >So the CGJ *is* significant for searching (and sorting). If you
    >want one sequence, you search for <patah, CGJ, hiriq>, if you
    >want the other, you search for <hiriq, CGJ, patah>. If you
    >don't care, and want to find either, *then* you strip out the
    >CGJ and normalize before comparison.
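
    To make the three cases above concrete, here is a minimal sketch using
    Python's standard unicodedata module. The base letter (lamed) and the
    names nfc and loose_key are illustrative assumptions, not anything
    proposed in this thread.

```python
import unicodedata

LAMED = "\u05DC"  # HEBREW LETTER LAMED, an arbitrary base letter (assumption)
PATAH = "\u05B7"  # HEBREW POINT PATAH, canonical combining class 17
HIRIQ = "\u05B4"  # HEBREW POINT HIRIQ, canonical combining class 14
CGJ = "\u034F"    # COMBINING GRAPHEME JOINER, canonical combining class 0


def nfc(text: str) -> str:
    """Return the NFC form of text."""
    return unicodedata.normalize("NFC", text)


def loose_key(text: str) -> str:
    """Comparison key for the "don't care" case: strip CGJ, then normalize."""
    return nfc(text.replace(CGJ, ""))


# Without CGJ the two orders are canonically equivalent: both normalize
# to <hiriq, patah>, because 14 sorts before 17.
plain_patah_first = LAMED + PATAH + HIRIQ
plain_hiriq_first = LAMED + HIRIQ + PATAH

# With CGJ between the marks, normalization leaves each order alone,
# because a class-0 character blocks canonical reordering.
cgj_patah_first = LAMED + PATAH + CGJ + HIRIQ
cgj_hiriq_first = LAMED + HIRIQ + CGJ + PATAH

same_without_cgj = nfc(plain_patah_first) == nfc(plain_hiriq_first)
distinct_with_cgj = nfc(cgj_patah_first) != nfc(cgj_hiriq_first)
same_when_loose = loose_key(cgj_patah_first) == loose_key(cgj_hiriq_first)
```

    An exact search compares nfc() of the query against nfc() of the text,
    while a search that should find either order compares loose_key() values.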

    I think Peter's point may be that scholars searching for patah followed by
    hiriq are most likely to search for <patah, hiriq>, and frankly who can
    blame them? This is what they see in the printed text, and it is what,
    hopefully, they would be able to input. So again we're looking at a
    solution that is only as attractive as the ability to hide it from users.
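
    One way such hiding might work, purely as a sketch: an input or search
    layer could insert CGJ wherever the user has typed two marks in an order
    that normalization would otherwise swap. The function name
    protect_mark_order is hypothetical, and the rule it applies (insert CGJ
    before any mark whose combining class is lower than that of the mark
    before it) is an assumption for illustration, not an agreed solution.

```python
import unicodedata

CGJ = "\u034F"  # COMBINING GRAPHEME JOINER, canonical combining class 0


def protect_mark_order(text: str) -> str:
    """Insert CGJ between adjacent marks that normalization would reorder.

    Hypothetical helper: a mark is "protected" when its canonical combining
    class is non-zero and lower than the class of the preceding character.
    """
    out = []
    prev_ccc = 0
    for ch in text:
        ccc = unicodedata.combining(ch)
        if ccc != 0 and prev_ccc > ccc:
            out.append(CGJ)  # a class-0 character blocks canonical reordering
        out.append(ch)
        prev_ccc = ccc
    return "".join(out)


# What the user types, as seen in print: lamed, patah, hiriq (no CGJ).
typed = "\u05DC\u05B7\u05B4"
query = protect_mark_order(typed)  # lamed, patah, CGJ, hiriq
```

    Text that is already in canonical order, or that already contains the
    CGJ, passes through unchanged, so the same function could be applied to
    both the query and the stored text before an exact comparison.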

    I am working on some exhaustive documentation of the normalisation problems
    affecting Hebrew mark ordering, which will ensure that we have a good grasp
    of the extent of the problem and a clear view of all the permutations that
    need to be taken into account by any solution.

    John Hudson

    Tiro Typeworks www.tiro.com
    Vancouver, BC tiro@tiro.com

    If you browse in the shelves that, in American bookstores,
    are labeled New Age, you can find there even Saint Augustine,
    who, as far as I know, was not a fascist. But combining Saint
    Augustine and Stonehenge -- that is a symptom of Ur-Fascism.
                                                                 - Umberto Eco


