Re: Braille, CJK and unicode

From: Jeroen Ruigrok van der Werven (asmodai@in-nomine.org)
Date: Sat Jan 31 2009 - 17:16:50 CST

    -On [20090131 23:32], Samuel Thibault (samuel.thibault@ens-lyon.org) wrote:
    >Here I do not really care about how things are pronounced, but what they
    >_mean_.

    But even then, a single kanji may have one to five meanings, and when used
    in a compound it may take on an entirely different meaning.

    Take for example the kanji for love, ai (U+611B, 愛). It can be read as
    megumi, a female name, or as ai, also a female name (along with about
    15-20 other female names), and as ai it also denotes the concept of love
    or affection.

    And that is before I even mention things like okurigana, where the
    hiragana added at the end of a kanji can indicate yet another meaning of
    a word.

    So unless you use a morphological analyser (mecab, chasen), you are going to
    lose a lot of information if you insist on a raw one-to-one mapping from
    kanji to English.
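
    To make the loss concrete, here is a minimal sketch in Python. It is not
    a morphological analyser and does not use mecab or chasen; it only
    contrasts a per-character lookup with a greedy longest-match lookup over
    a tiny hand-made gloss table. The tables CHAR_GLOSS and WORD_GLOSS and
    both function names are hypothetical, chosen purely for illustration.

```python
# Toy gloss tables, hand-picked for illustration only (an assumption,
# not data from any real dictionary or analyser).
CHAR_GLOSS = {
    "愛": "love", "人": "person",
    "手": "hand", "紙": "paper",
    "今": "now", "日": "day",
}
WORD_GLOSS = {
    "愛人": "lover",   # not "love person"
    "手紙": "letter",  # not "hand paper"
    "今日": "today",   # not "now day"
}


def gloss_per_char(text):
    """Raw one-to-one mapping: every character is glossed in isolation."""
    return [CHAR_GLOSS.get(ch, "?") for ch in text]


def gloss_longest_match(text):
    """Greedy longest-match lookup: prefer known compounds over single characters.

    This is only a crude stand-in for real segmentation; a proper analyser
    also handles readings, okurigana, inflection and ambiguity.
    """
    max_len = max(len(word) for word in WORD_GLOSS)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, down to two characters.
        for n in range(min(max_len, len(text) - i), 1, -1):
            chunk = text[i:i + n]
            if chunk in WORD_GLOSS:
                out.append(WORD_GLOSS[chunk])
                i += n
                break
        else:
            # No compound starts here; fall back to the single character.
            out.append(CHAR_GLOSS.get(text[i], "?"))
            i += 1
    return out


if __name__ == "__main__":
    for sample in ("愛", "愛人", "手紙", "今日"):
        print(sample, gloss_per_char(sample), gloss_longest_match(sample))
```

    Even this crude compound lookup recovers "lover" for 愛人 where the
    per-character mapping yields "love" and "person". It still knows nothing
    about readings, names or okurigana, which is where a real analyser
    comes in.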

    -- 
    Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
    イェルーン ラウフロック ヴァン デル ウェルヴェン
    http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
    Earth to earth, ashes to ashes, dust to dust...
    

