Re: TLG and Beta code

From: Nick Nicholas (
Date: Wed Aug 27 2003 - 07:33:16 EDT

    On Wednesday, Aug 27, 2003, at 18:55 Australia/Melbourne, Ecartis wrote:

    > From: "Raymond Mercier" <>
    > Subject: TLG and Beta code
    > Date: Wed, 27 Aug 2003 09:49:20 +0100
    > David,
    > I am glad to see this much progress, yet, as I noticed after posting,
    > the zero symbol is actually missing in
    > beta code, so your Beta code -Unicode equivalences would not have it.
    > I think it is fair to say that the TLG have avoided the parts of
    > mathematical texts where the symbol is common, as in the various
    > tables in Ptolemy's Almagest (where all the tables are omitted by
    > TLG). This symbol is in reality more common than the rarities listed
    > in quickbeta. In the editions I am involved with we use U+014D, ō,
    > which is near enough I suppose.

    I count 368 instances of #130, the TLG entity for Greek zero, in the
    text of the Almagest the TLG has, and a further 543 in Pappus'
    commentary on the Almagest, 80 in Theon's commentary, and well over a
    thousand in Byzantine astronomers; so rumours of its absence in Beta
    code are exaggerated. :-) The TLG didn't actually avoid the tables (at
    least not those integrated into the text), though the current markup of
    the tables is somewhat dated.
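
A count like the ones above can be reproduced with a simple search over the Beta code text. The following is only a minimal sketch: the helper name `count_beta_escape` and the sample string are invented for illustration and are not TLG data or TLG tooling. The one subtlety is that a search for #130 should not also match a longer escape number such as #1300.

```python
import re


def count_beta_escape(text: str, number: int) -> int:
    """Count occurrences of the Beta escape #<number> in text.

    Hypothetical helper: the negative lookahead keeps #130 from
    matching inside a longer escape number such as #1300.
    """
    pattern = re.compile(rf"#{number}(?!\d)")
    return len(pattern.findall(text))


# Invented sample line, not taken from any TLG text.
sample = "A #130 B #1300 C #13 D #130."
print(count_beta_escape(sample, 130))
```

Only the two exact #130 escapes in the sample are counted; #1300 and #13 are skipped.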

    Of course, the scholarly markup of texts in general raises the question
    of when a glyph does need a Unicode codepoint, and when it is merely a
    variant of something else, or beyond the scope of plaintext. The
    listing of Beta escapes includes much that is either idiosyncratic or
    a variant of something else; the TLG has traditionally erred on the
    side of caution in including Beta escapes (equivalent to XML entities),
    but the requirements for TLG markup are not necessarily the same as
    those for inclusion in Unicode.

    The equivalent glyph the TLG has posted for #130 is omicron, though of
    course the print edition used for the Almagest has its Greek zero
    slightly different (it's closer to a Goudy-style Arabic zero, from
    memory). Whether it merits its own codepoint, or is merely a glyph
    variant of U+0030 Digit Zero, is probably a debate for another time and
    place. What to do with such "one-off" glyphs is the kind of issue the
    Text Encoding Initiative is having to deal with, though.

    One might argue against omicron or o-macron for Greek Zero on the
    grounds that this isn't really a character but a digit; but then these
    texts use letters for digits anyway. So I don't see a clear rationale
    for one way or the other. However, I think the numerical diacritic for
    the zero should be the same as for other Greek letters, and it should
    be U+0305 Combining Overline rather than U+0304 Combining Macron.
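
To make the candidate encodings concrete, here is a small sketch using Python's standard `unicodedata` module. It only lists the codepoints involved in each option discussed above; the `describe` helper and the constant names are illustrative assumptions, and nothing here reflects an official TLG or Unicode mapping for Greek zero.

```python
import unicodedata

OMICRON = "\u03bf"   # GREEK SMALL LETTER OMICRON
OVERLINE = "\u0305"  # COMBINING OVERLINE
MACRON = "\u0304"    # COMBINING MACRON
O_MACRON = "\u014d"  # LATIN SMALL LETTER O WITH MACRON


def describe(s: str) -> list[str]:
    """Return 'U+XXXX NAME' for each codepoint in s (illustrative helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in s]


candidates = {
    "omicron + overline": OMICRON + OVERLINE,
    "omicron + macron": OMICRON + MACRON,
    "Latin o-macron (decomposed)": unicodedata.normalize("NFD", O_MACRON),
}

for label, text in candidates.items():
    print(label)
    for line in describe(text):
        print("   ", line)
```

Note that the precomposed U+014D decomposes to a Latin o plus U+0304, so it differs from the omicron-based options in both the base letter and the diacritic.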

    "Assuming, for whatever reasons, that neither scholar presented the
      evidence properly, then there remains a body of evidence you have not
      yet destroyed because it has never been presented." --- Harold Fleming
