Re: Japanese EUC and Shift-JIS text samples?

From: Yung-Fong Tang (ftang@netscape.com)
Date: Mon Oct 04 1999 - 16:27:51 EDT


HTML could also be treated as plain text from the converter's point of
view, right?

If so:

http://home.netscape.com/ja for Shift_JIS
http://www.yahoo.co.jp/ for EUC-JP

Momoi, do you have better data?

Frank da Cruz wrote:

> Does anybody have fairly large ftp-able samples of Shift-JIS
> (Code Page 932) plain text containing a "typical" mixture of
> halfwidth Roman, halfwidth Katakana, and Kanji? (Does anybody
> have an idea what the typical mixture might be over a very
> large sample of Japanese text?)
>
> Same question for Japanese EUC.
>
> And for that matter, also JIS-7.
>
> As far as I know, these are the only three commonly-used
> Japanese character sets (besides Unicode) that include both
> single- and doublewidth characters.
>
> (For working on conversion to/from Unicode, of course :-)
>
> - Frank
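
For the "typical mixture" question, a rough way to measure it on any sample
is to decode the bytes and tally characters by class. The following is only a
minimal sketch, not a definitive tool: the names classify and mixture are
hypothetical, it relies on Python's built-in shift_jis and euc_jp codecs, and
it assumes "Kanji" means the basic CJK Unified Ideographs block
(U+4E00..U+9FFF), which ignores extension blocks.

```python
from collections import Counter


def classify(ch: str) -> str:
    """Bucket one decoded character (hypothetical helper, coarse ranges)."""
    cp = ord(ch)
    if 0x21 <= cp <= 0x7E:
        return "halfwidth_roman"      # printable ASCII
    if 0xFF61 <= cp <= 0xFF9F:
        return "halfwidth_katakana"   # Halfwidth Katakana forms
    if 0x4E00 <= cp <= 0x9FFF:
        return "kanji"                # assumption: basic CJK block only
    return "other"                    # kana, fullwidth forms, symbols, ...


def mixture(raw: bytes, encoding: str) -> Counter:
    """Decode raw bytes and count non-whitespace characters per class."""
    text = raw.decode(encoding, errors="replace")
    return Counter(classify(ch) for ch in text if not ch.isspace())


if __name__ == "__main__":
    sample = "ABC ｱｲｳ 漢字"
    for enc in ("shift_jis", "euc_jp"):
        print(enc, dict(mixture(sample.encode(enc), enc)))
```

Since HTML is treated as plain text here, tags and attribute names would be
counted as halfwidth Roman, so the Roman share of a web page will be inflated
relative to its visible text.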
