Re: GCGID for U+03B8

From: Doug Ewell (dewell@adelphia.net)
Date: Sat Oct 12 2002 - 16:57:30 EDT

    Asmus Freytag <asmusf at ix dot netcom dot com> wrote:

    >> IBM has a Web page containing many PDF charts of code pages, and they
    >> have the same problem: some show one GCGID for U+03B8, others show the
    >> other one.
    >
    > Wouldn't you be able to tell by the shape associated with the GCGID?

    There were no shapes to look at. The tables in the Unicode 1.0 book,
    and tables-in-electronic-form associated with Unicode 1.1, identified
    the Unicode character U+03B8 with GT610000 in the mapping tables for
    MS-DOS code page 869 and EBCDIC code page 875, and with GT610002 for
    Windows code page 1253 and various East Asian DBCS code pages.

    In both the Unicode 1.0 and 3.0 books, U+03B8 is represented with the
    "straight" theta glyph, while U+03D1 (not listed in any of the Unicode
    1.x tables) is represented with the "loopy" glyph.

    Markus's answer seems to indicate that GT61 is what really identifies
    the Greek lower-case theta. The "0001" suffix specifically calls for
    the loopy glyph and "0002" calls for the straight glyph, while "0000" is
    a generic suffix (exact glyph unspecified). But as I wrote in a
    separate message to Markus, it gets worse; there are other Unicode
    characters (mainly symbols) for which two or more *very* different
    GCGIDs are listed, depending on which reference source you use.
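
    As a rough illustration of that reading, here is a minimal Python
    sketch. It assumes a GCGID splits into a 4-character base and a
    4-character suffix, as the examples in this thread suggest; the
    function names are hypothetical, and the suffix meanings are only
    those reported above for GT61, not a general rule.

```python
# Suffix meanings as reported in this thread for GT61 (Greek small theta) only.
# Other GCGID bases may use their suffixes differently.
GT61_SUFFIXES = {
    "0000": "generic (exact glyph unspecified)",
    "0001": "loopy glyph",
    "0002": "straight glyph",
}


def split_gcgid(gcgid):
    """Split an 8-character GCGID into (base, suffix).

    Assumption: the first four characters identify the character and the
    last four select a glyph variant, as with GT610000 / GT610002.
    """
    if len(gcgid) != 8:
        raise ValueError(f"expected an 8-character GCGID, got {gcgid!r}")
    return gcgid[:4], gcgid[4:]


def describe_theta(gcgid):
    """Return the glyph description for a GT61 GCGID, or None otherwise."""
    base, suffix = split_gcgid(gcgid)
    if base != "GT61":
        return None
    return GT61_SUFFIXES.get(suffix, "unknown suffix")


if __name__ == "__main__":
    for gcgid in ("GT610000", "GT610001", "GT610002"):
        print(gcgid, "->", describe_theta(gcgid))
```

    Under this reading, GT610000 and GT610002 share the base GT61, so
    the mapping tables that disagree on U+03B8 would differ only in how
    specific they are about the glyph.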

    It seems that GCGIDs predate any formal distinction between character
    and glyph of the type adopted by Unicode, making it somewhere between
    difficult and impossible to create a 1-to-1 mapping table between GCGIDs
    and Unicode.
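
    To make the problem concrete, the following Python sketch groups
    the U+03B8 entries reported earlier in this message by code point
    and flags any code point with more than one GCGID. The table labels
    and function names are hypothetical; the rows are only the ones
    named in this thread, not a complete mapping.

```python
from collections import defaultdict

# (source table label, Unicode code point, GCGID) as reported in this thread.
# The East Asian DBCS code pages are omitted because they are not named here.
REPORTED_ROWS = [
    ("CP869", 0x03B8, "GT610000"),
    ("CP875", 0x03B8, "GT610000"),
    ("CP1253", 0x03B8, "GT610002"),
]


def gcgids_by_code_point(rows):
    """Collect the set of GCGIDs that each code point is mapped to."""
    grouped = defaultdict(set)
    for _table, code_point, gcgid in rows:
        grouped[code_point].add(gcgid)
    return grouped


def ambiguous_code_points(rows):
    """Return code points that map to more than one GCGID, with sorted IDs."""
    return {
        code_point: sorted(gcgids)
        for code_point, gcgids in gcgids_by_code_point(rows).items()
        if len(gcgids) > 1
    }


if __name__ == "__main__":
    for code_point, gcgids in ambiguous_code_points(REPORTED_ROWS).items():
        print(f"U+{code_point:04X}: {', '.join(gcgids)}")
```

    Any code point this reports cannot be assigned a single GCGID
    without first choosing which reference source to trust.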

    -Doug Ewell
     Fullerton, California


