Re: Level of Unicode support required for various languages

From: Andrew West (andrewcwest@gmail.com)
Date: Fri Oct 26 2007 - 10:12:29 CDT

    On 26/10/2007, Mark E. Shoulson <mark@kli.org> wrote:
    >
    > >> (Examples exist, like U+3ADA (㫚) and U+66F6 (曶).)
    > >
    > > The two characters have different meanings, though I have to admit I am
    > > hard pushed to find a font that maintains the difference in the bottom
    > > half of the character 曰 U+66F0 and 日 U+65E5 respectively.
    > Yeah, and an "x" in English has a different meaning (sound) than an "x"
    > in Spanish (letters "mean" sounds; Chinese graphs mean words. More or
    > less). Yet we still encode them the same because they look the same.
    > Unicode generally tries to code what's written more than what's meant, I
    > thought.

    These two characters *look similar*, and in many fonts it is difficult
    to distinguish them clearly, but they are actually written with
    different, *non-unifiable* components.

    U+3ADA 㫚
    Written with Radical 72 (RI4 日 "sun") and 勿 WU4 phonetic
    Also written as U+6612 昒 HU1
    Pronounced HU1
    Means "early morning, daylight" (hence the RI4 日 "sun" radical)

    U+66F6 曶
    Written with Radical 73 (YUE1 曰 "speak") and 勿 WU4 phonetic
    Also written as U+5FFD 忽 HU1
    Pronounced HU1
    An onomatopoeic word meaning "the sound of exhalation" (hence the YUE1
    曰 "speak" radical)
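
    As a rough illustration of the point that these are separately encoded
    and not unified, the following Python sketch prints the two code points
    and checks that no Unicode normalization form folds one into the other.
    The names SUN_FORM, SPEAK_FORM, describe and same_after_normalization
    are made up for this example; only the standard unicodedata module is
    assumed.

```python
import unicodedata

# The two look-alike ideographs, written as escapes so the example does not
# depend on the font or the file encoding.
SUN_FORM = "\u3ada"    # Radical 72 (sun) over the WU4 phonetic
SPEAK_FORM = "\u66f6"  # Radical 73 (speak) over the WU4 phonetic


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def same_after_normalization(a: str, b: str) -> bool:
    """True if any of the four normalization forms maps a and b to the same string."""
    return any(
        unicodedata.normalize(form, a) == unicodedata.normalize(form, b)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    )


if __name__ == "__main__":
    for ch in (SUN_FORM, SPEAK_FORM):
        print(describe(ch))
    # Expected to be False: the two remain distinct under every form.
    print(same_after_normalization(SUN_FORM, SPEAK_FORM))
```

    A check like this only shows that software treats the two as different
    characters; it says nothing about which one a given text ought to have
    used.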

    Unfortunately the similarity between these two characters means that
    sometimes the wrong character is used. For example, in the on-line
    version of the Kangxi Dictionary U+3ADA is used throughout for the
    entry for U+66F6, and U+66F6 is used throughout for the entry for
    U+3ADA!

    <http://www.zdic.net/zd/zi/ZdicE6Zdic9BZdicB6.htm> (click on "Kangxi
    Zidian" tab)
    <http://www.zdic.net/zd/zi2/ZdicE3ZdicABZdic9A.htm> (click on "Kangxi
    Zidian" tab)

    (at least I think they've got the characters the wrong way round --
    but maybe it's me and/or Unihan that is confused)

    Andrew


