Re: Subset of Unicode to represent Japanese Kanji?

From: 11digitboy@bolt.com
Date: Wed Jul 12 2000 - 12:18:20 EDT


Correct me if I'm wrong, but I think you NEED kana
for Japanese. How can you even write "desu" ("is")
without it??

Am I right in supposing that Japanese people hate
that their kana take up 3 bytes per character in UTF-8,
while the Roman letters I am using now take up only
1 byte apiece? If I were Japanese, I'd say this sucks.
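
For anyone who wants to check the byte counts, here is a minimal sketch. It assumes the 3-bytes-per-kana figure refers to UTF-8; the sample characters and the helper name encoded_length are my own illustrative choices. UTF-16 is included only for comparison.

```python
# Compare how many bytes one character occupies in UTF-8 and UTF-16.
# The sample characters are illustrative, not taken from the thread.
SAMPLES = {
    "Latin small letter a (U+0061)": "a",
    "hiragana de (U+3067)": "\u3067",
    "katakana ni (U+30CB)": "\u30cb",
    "kanji U+65E5": "\u65e5",
}


def encoded_length(ch, encoding):
    """Return the number of bytes needed to encode ch in the given encoding."""
    return len(ch.encode(encoding))


if __name__ == "__main__":
    for label, ch in SAMPLES.items():
        # "utf-16-be" avoids counting the 2-byte byte order mark.
        print(label,
              "UTF-8:", encoded_length(ch, "utf-8"),
              "UTF-16:", encoded_length(ch, "utf-16-be"))
```

In UTF-16 the kana and the Roman letter both take 2 bytes, so the 3-to-1 gap is specific to UTF-8.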

To give an example of a katakana word that is an
international word: "nidoran".

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
11digitboy@bolt.com - email
(917) 421-3909 x1133 - voicemail/fax

---- Otto Stolz <Otto.Stolz@uni-konstanz.de> wrote:
> On 2000-07-11 at 7:02, Michael Martin wrote:
> > English, Dutch, French, German, Italian, Japanese, Portuguese, and Spanish.
> > It is my understanding that all of these languages except Japanese can be
> > supported with the basic Latin and Latin Supplement subset of Unicode
> > (U+0000 ... U+00FF [...]).
>
> Latin-1 was invented to support those languages, but falls short of doing
> so adequately. You will need additional characters from the following
> ranges:
> - Latin Extended A (e. g. U+0152 and U+0153 for French, U+0133 for Dutch,
>   perhaps U+017F for German (if you want to cover Fraktur fonts, that is))
> - general punctuation (e. g. U+201E and U+201A for German; U+2019 and
>   probably U+2010 through U+2015 for all of those languages)
> - Currency Symbols (at least U+20AC, perhaps also U+20A3, U+20A4,
>   U+20A7; note also U+00A3, U+00A5 in the Latin-1, and U+0192 in the
>   Latin Extended-B regions, respectively)
> - Depending on the application envisaged, you may also wish to include
>   characters from the following areas:
>   - Number Forms (U+2150 through U+218F), particularly fractions
>   - Arrows (U+2190 through U+21FF); Box Drawing, Block Elements, and
>     Geometric Shapes (U+2500 through U+25FF)
>   - Mathematical Operators and Miscellaneous Technical (U+2200 through
>     U+23FF); Miscellaneous Symbols and Dingbats (U+2600 through U+27BF)
> - Depending on the technology used, you may have to include characters
>   from the following ranges:
>   - Superscripts and Subscripts (U+2070 through U+209F)
>   - Presentation Forms (e.g. ligatures U+FB00 through U+FB06)
>   - The Replacement Character U+FFFD
> to name just a few :-)
>
> Good starting points for your consideration could be
> - the EES, cf. <http://www.egt.ie/standards/ees.html>,
> - Microsoft's WGL 4 character set, cf.
>   <http://www.microsoft.com/typography/OTSPEC/WGL4.htm>.
>
> > The Japanese I must support is the Kanji form. [...] I cannot support
> > Unicode in its entirety due to memory constraints.
>
> If I am not mistaken, Kanji are ideographic characters, which would take
> the lion's share of memory to implement. Probably, you have to support
> kana (hiragana or katakana).
>
> I do not know Japanese, so others may jump in.
>
> Best wishes,
> Otto Stolz
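
One way to see which characters a memory-constrained subset would leave out is to keep a small table of code point ranges and test text against it. The sketch below is only an illustration: the table SUBSET and the helpers block_of and unsupported are hypothetical names, the choice of blocks is an assumption (a few of the ranges mentioned in the quoted message plus the two kana blocks), and the block boundaries follow the Unicode code charts.

```python
# Hypothetical subset table: (first code point, last code point, block name).
# Which blocks to include is an assumption made for illustration only.
SUBSET = [
    (0x0000, 0x00FF, "Basic Latin and Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x2000, 0x206F, "General Punctuation"),
    (0x20A0, 0x20CF, "Currency Symbols"),
    (0x3040, 0x309F, "Hiragana"),
    (0x30A0, 0x30FF, "Katakana"),
    (0xFFFD, 0xFFFD, "Replacement Character"),
]


def block_of(ch):
    """Return the name of the subset block containing ch, or None."""
    cp = ord(ch)
    for first, last, name in SUBSET:
        if first <= cp <= last:
            return name
    return None


def unsupported(text):
    """Return the characters of text that fall outside the subset."""
    return [ch for ch in text if block_of(ch) is None]


if __name__ == "__main__":
    # "desu" in hiragana is covered; a kanji such as U+65E5 is not,
    # because no CJK ideograph range is in the table.
    print(unsupported("\u3067\u3059"))
    print(unsupported("\u65e5"))
```

With a table like this, kana cost only two small ranges, while the kanji that real text needs would each have to be added separately.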




This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:05 EDT