Re: Katakana and Kanji (was: Re: interleaved ordering (was RE: Phoenician))

From: Benjamin Peterson
Date: Tue May 11 2004 - 08:14:14 CDT

    On Mon, 10 May 2004 16:29:25 -0700 (PDT), "Kenneth Whistler" said:
    > Stefan Persson wrote:
    > > Mike Ayers wrote:
    > > > I have not seen
    > > > katakana joined to kanji (or romaji), and suspect that such does not occur.
    > >
    > > There are a few cases, e.g. ソ連 (So-Ren: Soviet Union), but that could
    > > also be written as two kanji as 蘇連 (which is however very rare in
    > > modern Japanese).
    > It's actually quite common, depending on how you choose
    > to construe "joined".

    Indeed, and in addition to the examples you give there are at least three
    more cases:

    1 -- increasingly, people use katakana in _half_ of a jukugo (multi-kanji
    word) because one of the kanji is too hard to remember. The result needs
    to be treated as a single word but it is part kanji and part katakana.

    2 -- in the past, the roles of hiragana and katakana were less well
    defined, and many things (inflections, particles) that would be in
    hiragana now were in katakana. This results in words that are kanji +
    katakana suffixes.

    3 -- some expressions, such as the place name 'Kasumigaseki' (霞ヶ関),
    have a katakana character in them just for the heck of it.

    There are also a number of common situations in which romaji are mixed
    with katakana or kanji. I think it is impossible to make any rigid rules
    about which combinations of the four scripts can occur.
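    To make the mixing concrete, here is a rough Python sketch that splits a
    word into runs of the same script. The names script_of and script_runs
    are made up for this illustration, and classifying by Unicode character
    name is only an approximation that I am assuming is good enough for a
    demonstration.

```python
import unicodedata
from itertools import groupby


def script_of(ch):
    """Roughly classify one character by its Unicode character name."""
    name = unicodedata.name(ch, "")
    if "CJK UNIFIED IDEOGRAPH" in name:
        return "kanji"
    # Checked before hiragana so that shared marks such as the
    # KATAKANA-HIRAGANA PROLONGED SOUND MARK count as katakana here.
    if "KATAKANA" in name:
        return "katakana"
    if "HIRAGANA" in name:
        return "hiragana"
    if "LATIN" in name:
        return "romaji"
    return "other"


def script_runs(word):
    """Split a word into consecutive (script, text) runs."""
    return [(script, "".join(chars))
            for script, chars in groupby(word, key=script_of)]


print(script_runs("ソ連"))    # katakana followed by kanji
print(script_runs("霞ヶ関"))  # kanji, katakana, kanji
```

    Both of these are single words, yet each splits into more than one
    script run, which is why a script boundary cannot be treated as a word
    boundary.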

    Luckily, as Japanese typography allows line breaks anywhere except in the
    middle of a romaji word and next to some punctuation marks, life is still
    bearable. Morphological analyzers should ignore whitespace completely
    and accept that a 'word' can span any combination of scripts.
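    The line-break rule above can be sketched in a few lines of Python. This
    is a deliberately simplified illustration, not a real line-breaking
    implementation: the two punctuation sets are small samples I picked as
    an assumption, romaji is approximated as ASCII letters and digits, and
    the name break_points is hypothetical.

```python
# Assumed sample sets; real typography rules cover many more characters.
NO_BREAK_BEFORE = set("、。」』）")  # must not start a line
NO_BREAK_AFTER = set("「『（")       # must not end a line


def is_romaji(ch):
    """Approximate romaji as ASCII letters and digits."""
    return ch.isascii() and ch.isalnum()


def break_points(text):
    """Return indices i where a line break is allowed before text[i]."""
    points = []
    for i in range(1, len(text)):
        prev, cur = text[i - 1], text[i]
        # Never break in the middle of a romaji word.
        if is_romaji(prev) and is_romaji(cur):
            continue
        # Never break next to the listed punctuation marks.
        if cur in NO_BREAK_BEFORE or prev in NO_BREAK_AFTER:
            continue
        points.append(i)
    return points


print(break_points("ソ連はUSSR。"))  # [1, 2, 3]
```

    Note that a break is allowed between ソ and 連 even though they form one
    word, so these break points say nothing about where words begin or end.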


      Benjamin Peterson
