Re: Level of Unicode support required for various languages

Date: Fri Oct 26 2007 - 11:43:29 CDT

    As Andrew explains quite clearly below, this is a case where Unicode got it
    correct. The difference is slight but very significant, even though it is
    confusing (I think I got these two reversed earlier). To unify them
    would be to change the language, which is not Unicode's job.


    Quoting Andrew West <>:

    > On 26/10/2007, Mark E. Shoulson <> wrote:
    >> >> (Examples exist, like U+3ADA (㫚) and U+66F6 (曶).)
    >> >
    >> > The two characters have different meanings, though I have to admit I am
    >> > hard pushed to find a font that maintains the difference in the bottom
    >> > half of the character 曰 U+66F0 and 日 U+65E5 respectively.
    >> Yeah, and an "x" in English has a different meaning (sound) than an "x"
    >> in Spanish (letters "mean" sounds; Chinese graphs mean words. More or
    >> less). Yet we still encode them the same because they look the same.
    >> Unicode generally tries to code what's written more than what's meant, I
    >> thought.
    > These two characters *look similar*, and in many fonts it is difficult
    > to distinguish them clearly, but they are actually written with
    > different, *non-unifiable* components.
    > U+3ADA 㫚
    > Written with Radical 72 (RI4 日 "sun") and 勿 WU4 phonetic
    > Also written as U+6612 昒 HU1
    > Pronounced HU1
    > Means "early morning, daylight" (hence the RI4 日 "sun" radical)
    > U+66F6 曶
    > Written with Radical 73 (YUE1 曰 "speak") and 勿 WU4 phonetic
    > Also written as U+5FFD 忽 HU1
    > Pronounced HU1
    > An onomatopoeic word meaning "the sound of exhalation" (hence the YUE1
    > 曰 "speak" radical)
    > Unfortunately the similarity between these two characters means that
    > sometimes the wrong character is used. For example, in the on-line
    > version of the Kangxi Dictionary U+3ADA is used throughout for the
    > entry for U+66F6, and U+66F6 is used throughout for the entry for
    > U+3ADA !
    > <> (click on "Kangxi
    > Zidian" tab)
    > <> (click on "Kangxi
    > Zidian" tab)
    > (at least I think they've got the characters the wrong way round --
    > but maybe it's me and/or Unihan that is confused)
    > Andrew
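
    For anyone who wants to check the encoding side of this at a keyboard, here
    is a minimal sketch in Python (standard library only). It is an
    illustration, not part of the original exchange: the helper names
    describe and same_after_normalization are hypothetical. It shows that the
    two ideographs are separate code points, that no Unicode normalization
    form folds one into the other, and that the numeric character references
    quoted above decode to U+66F0 and U+65E5.

```python
import html
import unicodedata

# The two look-alike ideographs discussed in the thread, by code point.
SUN_FORM = "\u3ada"    # Radical 72 ("sun") plus the WU4 phonetic
SPEAK_FORM = "\u66f6"  # Radical 73 ("speak") plus the WU4 phonetic


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def same_after_normalization(a: str, b: str) -> bool:
    """True if any of the four normalization forms maps a and b together."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return any(
        unicodedata.normalize(f, a) == unicodedata.normalize(f, b)
        for f in forms
    )


# The numeric character references from the quoted text, decoded.
radicals = html.unescape("&#26352; &#26085;").split()

if __name__ == "__main__":
    print(describe(SUN_FORM))
    print(describe(SPEAK_FORM))
    print(same_after_normalization(SUN_FORM, SPEAK_FORM))
    print([f"U+{ord(r):04X}" for r in radicals])
```

    Nothing here says which glyph a given font actually draws; it only
    confirms that the standard treats the two as distinct characters.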

    This archive was generated by hypermail 2.1.5 : Fri Oct 26 2007 - 11:45:27 CDT