Re: List of Japanese Shift_JIS characters which are not supported in Unicode

From: Philipp Reichmuth (reichmuth@web.de)
Date: Mon Oct 11 2004 - 10:24:55 CST

    souravm wrote:
    > Is there anywhere an exhaustive list of Japanese characters (especially
    > Shift_JIS characters) which are not supported in Unicode ?

    Are there any characters in Shift_JIS that cannot be represented in
    Unicode? If so, are any of them non-vendor-specific? (For the
    vendor-specific ones you'll need a vendor-specific mapping table anyway.)

    I'm not sure about JIS X 0213 and compatibility ideographs.

    You might take a look at
    http://www.opengroup.or.jp/jvc/cde/ucs-conv-e.html, even though it's
    probably outdated. Another interesting page might be
    http://www.debian.or.jp/~kubota/unicode-symbols.html; this guy generally
    seems to be a bit anti-Unicode.

    There are some round-trip mapping problems with a small number of
    characters due to ambiguities
    (http://support.microsoft.com/default.aspx?scid=kb;en-us;Q170559), but
    there the problem is the conversion back FROM Unicode: you can convert
    a Shift_JIS character X to a Unicode character U(X) and back to
    J(U(X)), but in a small number of cases you cannot guarantee that
    J(U(X)) = X, because several Shift_JIS codes map to the same Unicode
    character. AFAIK there are no "unsupported characters" in the back
    conversion; you might just end up at a different code point.
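
    A rough way to see which codes are affected is to run the round trip
    over every double-byte code and list the ones that come back
    different. The following is only a sketch: it assumes that Python's
    built-in "cp932" codec is a reasonable stand-in for the Microsoft
    table, and the names round_trip and find_lossy_round_trips are made
    up for illustration. Other vendors' tables will give other results.

```python
# Sketch: find Shift_JIS double-byte codes X where J(U(X)) != X.
# Assumption: Python's built-in "cp932" codec approximates the Microsoft
# (Windows code page 932) table; other vendors' tables will differ.


def round_trip(sjis_bytes, codec="cp932"):
    """Return (U(X), J(U(X))) for the byte string X (hypothetical helper)."""
    text = sjis_bytes.decode(codec)      # X -> U(X)
    return text, text.encode(codec)      # U(X) -> J(U(X))


def find_lossy_round_trips(codec="cp932"):
    """List (X, U(X), J(U(X))) for double-byte X that do not round-trip."""
    # Lead and trail byte ranges of the Shift_JIS double-byte area.
    lead_bytes = list(range(0x81, 0xA0)) + list(range(0xE0, 0xFD))
    trail_bytes = list(range(0x40, 0x7F)) + list(range(0x80, 0xFD))
    lossy = []
    for lead in lead_bytes:
        for trail in trail_bytes:
            original = bytes([lead, trail])
            try:
                text, back = round_trip(original, codec)
            except UnicodeError:
                continue  # unassigned in this table, or not encodable back
            if back != original:
                lossy.append((original, text, back))
    return lossy


if __name__ == "__main__":
    for original, text, back in find_lossy_round_trips():
        print(f"{original.hex()} -> U+{ord(text[0]):04X} -> {back.hex()}")
```

    Each line printed is a code that lands on a different code point after
    the round trip. Codes that are simply unassigned in the table are
    skipped, not reported.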

    Philipp


