Japanese EUC and Shift-JIS text samples?

From: Frank da Cruz (fdc@watsun.cc.columbia.edu)
Date: Mon Oct 04 1999 - 12:17:15 EDT


Does anybody have fairly large ftp-able samples of Shift-JIS
(Code Page 932) plain text containing a "typical" mixture of
halfwidth Roman, halfwidth Katakana, and Kanji? (Does anybody
have an idea what the typical mixture might be over a very
large sample of Japanese text?)

Same question for Japanese EUC.

And for that matter, also JIS-7.

As far as I know, these are the only three commonly-used
Japanese character sets (besides Unicode) that include both
single- and doublewidth characters.

(For working on conversion to/from Unicode, of course :-)
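
Once a sample is in hand, the mixture asked about above could be measured
roughly as follows. This is only a minimal sketch: the function name
classify_mixture and its category labels are made up for illustration, and
it assumes Python's standard "shift_jis" and "euc_jp" codecs decode the
sample cleanly. Note that it counts characters after decoding to Unicode,
not raw bytes.

```python
from collections import Counter


def classify_mixture(data: bytes, encoding: str = "shift_jis") -> Counter:
    """Count characters per rough category in a Japanese text sample.

    Hypothetical helper for illustration; the categories are simplified.
    """
    counts = Counter()
    for ch in data.decode(encoding):
        cp = ord(ch)
        if ch.isspace():
            continue  # whitespace is ignored
        if 0x21 <= cp <= 0x7E:
            counts["roman"] += 1          # halfwidth Roman (printable ASCII range)
        elif 0xFF61 <= cp <= 0xFF9F:
            counts["hw_katakana"] += 1    # halfwidth Katakana block
        elif 0x4E00 <= cp <= 0x9FFF:
            counts["kanji"] += 1          # CJK Unified Ideographs (basic block only)
        else:
            counts["other"] += 1          # kana, fullwidth forms, symbols, etc.
    return counts
```

Calling classify_mixture(open(path, "rb").read(), "euc_jp") on an EUC file
would give the same kind of tally; dividing each count by the total gives
the proportions.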

- Frank


