Re: Unihan : Traditional characters having two simplified equivalents

From: John H. Jenkins (jenkins@apple.com)
Date: Wed Jan 07 2009 - 16:25:20 CST


    There isn't a detailed explanation anywhere of how this was done, I'm
    afraid. This data is largely derived from data donated to Unicode by
    Wenlin, Inc., and they would probably be the best ones to contact to
    get information on how these specific instances were generated in the
    first place.

    Meanwhile, if you have a reliable source that indicates that we have
    mapping information which is wrong or incomplete, you can report it to
    us and we'll take the appropriate action.

    Looking over the instances you cite, some of these appear to be
    simply wrong. For example, the mapping between 鯰 and 鲇 appears to
    rely on the two characters being synonyms, and should probably not be
    included in the Unihan database as an instance of simplification.
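
    To re-check cases like these against the data, here is a minimal
    sketch. It assumes the tab-separated layout of Unihan_Variants.txt
    (code point, field name, space-separated values) and the
    kSimplifiedVariant field. The sample lines are built from the pairs
    quoted in this thread rather than read from the real file, and the
    names parse_simplified, multi_simplified, PAIRS and SAMPLE are
    illustrative only.

```python
# Sketch: list traditional characters that carry more than one
# kSimplifiedVariant value. Assumes lines shaped like
# "U+XXXX<TAB>kSimplifiedVariant<TAB>U+YYYY U+ZZZZ".

def to_ucode(ch):
    """Format a character as a U+XXXX code point string."""
    return f"U+{ord(ch):04X}"

def parse_simplified(lines):
    """Return {traditional_char: [simplified_chars]} from Unihan-style lines."""
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) != 3 or fields[1] != "kSimplifiedVariant":
            continue  # ignore other fields and malformed lines
        source = chr(int(fields[0][2:], 16))
        # Drop any "<source" annotation that may follow a code point.
        targets = [chr(int(v.split("<")[0][2:], 16)) for v in fields[2].split()]
        mapping[source] = targets
    return mapping

def multi_simplified(mapping):
    """Keep only the entries that map to two or more simplified forms."""
    return {t: s for t, s in mapping.items() if len(s) > 1}

# Hypothetical sample: the seven pairs cited in the thread, plus one
# ordinary one-to-one mapping (審 -> 审) as a control.
PAIRS = {
    "瀋": "沈渖", "畫": "划画", "鍾": "钟锺", "靦": "腼䩄",
    "餘": "余馀", "鯰": "鲇鲶", "鹼": "硷碱", "審": "审",
}
SAMPLE = [
    "\t".join([to_ucode(t), "kSimplifiedVariant",
               " ".join(to_ucode(s) for s in ss)])
    for t, ss in PAIRS.items()
]

result = multi_simplified(parse_simplified(SAMPLE))
for trad, simp in result.items():
    print(trad, to_ucode(trad), "->", ", ".join(simp))
```

    Pointing parse_simplified at the lines of an actual
    Unihan_Variants.txt instead of SAMPLE would list whatever
    one-to-many simplification links the current release contains,
    which can then be compared against a dictionary source.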

    On Dec 29, 2008, at 7:57 AM, koxinga wrote:

    >
    > Hi!
    >
    > I am wondering how the Unicode Consortium selected the
    > traditional-simplified pairs. Is there a detailed explanation
    > somewhere of how it was done?
    >
    > More precisely, I don't understand the rationale behind the seven
    > "one traditional character, two simplified characters" links.
    >
    > * 瀋: 沈, 渖
    > * 畫: 划, 画
    > * 鍾: 钟, 锺
    > * 靦: 腼, 䩄 (U+4A44)
    > * 餘: 余, 馀
    > * 鯰: 鲇, 鲶
    > * 鹼: 硷, 碱
    >
    > According to what I found:
    > * "餘: 余, 馀" exists because of possible ambiguities between 餘
    > and 余, which already existed in Traditional Chinese.
    > * "瀋: 沈, 渖" and "鍾: 钟, 锺" exist because the general rules
    > (審 -> 审 and 釒 -> 钅) conflict with specific simplifications.
    > * The other four are links to synonymous characters or old variants.
    > For example, take "鯰: 鲇, 鲶". 鯰 -> 鲶 and 鮎 -> 鲇 are fine, since
    > they follow the general rule. 鯰 and 鮎 were synonyms, but is that a
    > reason to add 鯰 -> 鲇 (and why doesn't 鮎 -> 鲶 exist)?
    >
    > My 新华字典 (Xinhua Dictionary) gives:
    > * 瀋: 沈
    > * 畫: 画
    > * 鍾: 钟
    > * 餘: 余, 馀 (with a note explaining why)
    > * 鹼: 硷
    > It does not list simplifications that follow the general rules, so
    > these two are implied:
    > * 靦: 䩄 (U+4A44)
    > * 鯰: 鲶
    >
    > Are these exceptions in Unicode because of compatibility with older
    > character sets? Are there specific reasons? Are these reasons
    > explained somewhere?
    >
    > Thanks,
    >
    > Koxinga


