Re: Ancient Greek (symbols versus letters and duplicate letters)

From: Kenneth Whistler (
Date: Mon Apr 07 2003 - 17:59:00 EDT

    > Since Koppa has the same modern meaning as the ancient character (letter
    > Koppa and 90), this runs contrary to my understanding of the Unicode
    > philosophy, as I understand it.

    The point Joop was making is that in *modern* Greek usage a
    distinction is made, where the "lightning koppa" appears as a numeral
    for numbering of legal clauses, or the like, but where the "lollipop
    koppa" is the modern rendition of the archaic koppa (which, of course,
    also had a numeric value). It is *this* distinction which led to
    the separate encoding, not a failure to recognize that ultimately
    all the koppas were, in some sense, the "same letter".

    > Is there an established Unicode Greek
    > sorting algorithm?

    See the table for the Unicode Collation Algorithm, UTS #10.
    It isn't an "established Unicode Greek sorting algorithm" per se,
    but it does provide a default ordering for all Greek characters,
    along with all other Unicode characters. Archaic koppa isn't
    dealt with yet in that table, since it is a recent addition
    to Unicode.

    > I can see no reason to have lunate Sigma U+03F9 as a separate codepoint in a
    > font. Unless convincing information is received, I think I'll include a
    > lunate Sigma only as an alternate glyph to the "true" capital Greek Sigma,
    > U+03A3. This should preclude sorting and search problems.

    Probably a good idea.

    Note that the lunate sigma was also encoded for *modern* reasons.
    It is distinguished in modern typography, as Joop indicates.
    And while it can create problems for sorting and searching,
    there already is such a problem in modern Greek (or modern
    Greek representations of Classical or Ancient Greek) because
    of the sigma and final sigma -- both of which are also just
    the "same letter". This is *already* handled in the Unicode
    Collation Algorithm by giving all three flavors of sigma the
    same primary weights:

    03C3 ; [.0CA6.0020.0002.03C3] # GREEK SMALL LETTER SIGMA
    03F2 ; [.0CA6.0020.0004.03F2] # GREEK LUNATE SIGMA SYMBOL; QQK
    03C2 ; [.0CA6.0020.0019.03C2] # GREEK SMALL LETTER FINAL SIGMA; QQK
             ^^^^      ^^^^
             primary   tertiary

    The weights differ only at the tertiary level, so the three
    forms sort together and are distinguished only in the way
    that capital and small letters are distinguished.
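
    As a rough illustration of what equal primary weights mean in
    practice, here is a minimal Python sketch. It uses only the
    three collation elements quoted above; the names SIGMA_WEIGHTS,
    sort_key and same_at_primary are made up for this example, and
    the key construction is a simplification of UTS #10 (no
    variable weighting, contractions, or normalization), not a
    real implementation of the algorithm.

```python
# Collation elements copied from the three table entries quoted above:
# character -> (primary, secondary, tertiary) weights.
SIGMA_WEIGHTS = {
    "\u03C3": (0x0CA6, 0x0020, 0x0002),  # GREEK SMALL LETTER SIGMA
    "\u03F2": (0x0CA6, 0x0020, 0x0004),  # GREEK LUNATE SIGMA SYMBOL
    "\u03C2": (0x0CA6, 0x0020, 0x0019),  # GREEK SMALL LETTER FINAL SIGMA
}


def sort_key(text, levels=3):
    """Build a simplified multi-level key for a string of sigmas.

    All primary weights come first, then all secondary weights, then
    all tertiary weights, with 0 as a level separator. This is an
    illustrative simplification, not the full UTS #10 procedure.
    """
    elements = [SIGMA_WEIGHTS[ch] for ch in text]
    key = []
    for level in range(levels):
        key.extend(element[level] for element in elements)
        key.append(0)  # level separator
    return tuple(key)


def same_at_primary(a, b):
    """True if two strings are indistinguishable at the primary level."""
    return sort_key(a, levels=1) == sort_key(b, levels=1)
```

    Comparing with levels=1 treats all three sigmas as the same
    letter, which is what a loose search would want; the full
    three-level key still gives a stable order among them.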

    Your problem as a Classical numismatics specialist (and the
    same applies in general to paleographers and papyrologists)
    is that you confront many *more* glyphs for the various
    characters than just the very few distinctions that got
    encoded as distinct forms in the Unicode Standard for one
    modern reason or another. The eventual result is not going
    to be the encoding of each distinct glyph of each distinct
    letter as another character in Unicode. Instead, as you
    are doing, you need to lay out a subsidiary variant space
    which you can map to specialized fonts, and associate all
    of those variants with the Greek alphabet (in your case) or
    whatever base set of characters might apply in other
    paleographic traditions.
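
    One way such a variant space might be tied back to the base
    alphabet is sketched below in Python. Everything in it is an
    assumption made for illustration: the use of Private Use Area
    code points, the particular assignments, and the names
    VARIANT_TO_BASE, fold_variants and matches are all
    hypothetical, not anything defined by Unicode or by your fonts.

```python
# Hypothetical variant table: each private-use code point stands for one
# glyph variant in a specialized font and is associated with the base
# Greek letter it represents. The assignments are invented for this sketch.
VARIANT_TO_BASE = {
    "\uE000": "\u03C3",  # hypothetical sigma glyph variant -> SMALL LETTER SIGMA
    "\uE001": "\u03C3",  # hypothetical second sigma variant -> SMALL LETTER SIGMA
    "\uE002": "\u03D9",  # hypothetical koppa variant -> SMALL LETTER ARCHAIC KOPPA
}


def fold_variants(text):
    """Replace each known variant with its base letter; leave the rest alone."""
    return "".join(VARIANT_TO_BASE.get(ch, ch) for ch in text)


def matches(a, b):
    """Compare two strings while ignoring glyph-variant distinctions."""
    return fold_variants(a) == fold_variants(b)
```

    Folding before searching or sorting keeps the glyph
    distinctions available for display while letting ordinary
    Greek text processing see only the base letters.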


    > Many thanks,
    > Chris Hopkins
