Re: Chinese FVS? (was: RE: Cuneiform Free Variation Selectors)

From: Andrew C. West
Date: Thu Jan 22 2004 - 05:51:11 EST

    On Wed, 21 Jan 2004 11:13:33 -0700, John Jenkins wrote:
    > Granted, epigraphy is tough on plain text. As Unicode starts to deal
    > with dead scripts, we have to deal with the issues it raises.
    > Variation selectors are one way of doing it.

    Yes, but I'm delighted to see from document N2684 "Draft Agreement on Old Hanzi
    Encoding" that variation selectors are not the method proposed for dealing with
    archaic forms of the Han script. I think that encoding the Oracle Bone, Bronze
    Inscription and Small Seal pre-Han scripts separately from the modern Han script
    is definitely the right thing to do, although as glyph variation is an even
    bigger problem for the ancient unstandardised scripts than for the modern
    script, I wonder whether variation selectors might not play a role in the end.

    As I'm currently working on a proposal for the deceased Jurchen script, which
    also has a problem with glyph variation (about a third of the 1,355 entries in
    the most recent Jurchen dictionary are simple glyph variants, many almost
    indistinguishable from one another), maybe someone on the UTC could give me some
    advice? Should I:

    A. Stick to a strict character encoding model, and ignore glyph variants that
    have no semantic distinctions (as I did for Phags-pa).
    B. Indiscriminately code every glyph form that has ever been seen, on the basis
    that glyph variants are given in a respected dictionary.
    C. Propose distinct characters, but append a long list of proposed standardised
    variants to cover the simple glyph variants (some missing a dot here or adding a
    stroke there, some written in a more cursive manner, and some just differently

