RE: Lunate, Terminal, and Medial Sigma

From: Jim Allan (jallan@smrtytrek.com)
Date: Sun Nov 10 2002 - 14:06:51 EST

    Carl W. Brown posted:

    > There already is a Unicode solution for the problem. Check UAX #21.
    > If search engines use case insensitive compares then it should be no
    > problem.

    Yes, if only Google and other search engines would implement at least
    the minimum recommended foldings in
    http://www.unicode.org/Public/UNIDATA/CaseFolding.txt. Currently Google
    does not even equate /ß/ with /ss/.
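
    As a rough illustration of what such folding buys a search engine,
    here is a minimal sketch in Python, whose built-in str.casefold()
    applies full Unicode case folding. The helper name caseless_equal is
    my own, chosen for the example; a real search engine would probably
    also normalize and handle accents, which this does not attempt.

```python
def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings after full Unicode case folding.

    caseless_equal is a hypothetical helper name used only for this sketch.
    """
    return a.casefold() == b.casefold()


# Sharp s folds to "ss", so both spellings match.
print(caseless_equal("Straße", "STRASSE"))

# Final and non-final sigma fold to the same character.
print(caseless_equal("ΟΔΟΣ", "οδος"))

# Folding does not remove accents, so these remain distinct.
print(caseless_equal("οδός", "οδος"))
```

    Folding both the query and the indexed text this way is what lets
    separately encoded final and non-final sigma coexist with caseless
    search.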

    > There are a lot of exceptions to the rule so that you need separate
    > characters for the forms but you also need an algorithm that works
    > reasonably well for most cases.
    >
    > "Character (final sigma) is preceded by a sequence consisting of a
    > cased letter and a case-ignorable sequence, and character is not
    > followed by a sequence consisting of an ignorable sequence and then a
    > cased letter."

    I totally agree with this.
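
    To make the quoted rule concrete, here is a small sketch in Python.
    It relies on CPython's str.lower(), which applies a final-sigma
    condition of this kind when lowercasing capital sigma; the sample
    words and the helper name lower_all are mine, chosen for
    illustration only.

```python
def lower_all(words):
    """Lowercase each word; lower_all is a hypothetical helper for this sketch."""
    return [w.lower() for w in words]


samples = [
    "ΟΔΟΣ",   # sigma preceded by a cased letter, nothing cased after it
    "ΣΟΦΟΣ",  # the first sigma is followed by cased letters, the last is not
    "Σ",      # a lone sigma has no preceding cased letter
    "ςα",     # an explicitly encoded final sigma is already lowercase
]

for original, lowered in zip(samples, lower_all(samples)):
    print(original, "->", lowered)
```

    The last sample is the point of keeping separate characters: an
    exception typed explicitly in plain text passes through lowercasing
    unchanged, whatever the algorithm would have chosen.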

    My original post was directed against the argument that final sigma and
    non-final sigma should have been merged as a single character to be
    displayed as either non-final sigma, final sigma, or lunate sigma
    according to a higher protocol (e.g. an intelligent font).

    If this route had been taken, one would still require some method to
    indicate exceptions: either proprietary triggers in a font or other
    higher-level software, or an overriding variation selection character
    at the plain text level. It is not clear that any greater ease would
    have been gained by moving the necessary variation selection from one
    level of representation to another.

    Jim Allan
