RE: Subset of Unicode to represent Japanese Kanji?

From: addison@inter-locale.com
Date: Sat Jul 15 2000 - 08:08:55 EDT


This assumes that the only support required is output of previously
translated internal messages. If *that's* the case, you can usually save
even more memory by using a run-length encoded "picture" of the bitmap on
the display (in fact, I suggested exactly that solution for a, shall we
say, "closely related product" a few years ago)... and they went with the
solution Mike suggests below instead. It's a clever solution.

In practice, though, it turned out to be a pain in the neck for all concerned:

1. It meant that the ROM font and character tables had to be revised
every time the software was localized. Given the limited real estate in an
LCD, this basically meant redoing the "font" map every time.
2. I had a printout of Shift-JIS I kept in a drawer with highlighter on it
to show what characters we had bitmaps of. The translators never knew
which code points they'd assigned: we had to track it all after the fact.
3. This leads to fallacious thinking on the part of the software
developers, who are keen to carve up the English into words and reuse the
words (eek! I have only one "the" in English... and how many forms in
German? French? Spanish?)
4. The software has to work completely differently for Japanese than for
the Latin-1 languages. This is not good: it leads to two code bases,
multiple test regimes, weird bugs.
5. You can never, ever display user input (unless limited to, say, kana or
romaji, or Latin-1+, or limited to canned messages).

The real-real answer is, as suggested yesterday, fix the device to support
the necessary requirements (more ROM, more display, whatever). Yes,
::sigh::, I know I defended the other side last night! But I18N should
start before the factory cranks out 50,000 devices....;-)... usually,
though, it is too late and we get to be clever, rather than right.

I still favor RLE in this instance, because it lets the localizers get the
absolute max out of the display real estate and ALL of the languages can
work the same way.
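
To make the RLE idea concrete, here is a minimal sketch in Python. The
post does not say what run-length format was proposed, so everything
here is an assumption for illustration: a 1-bit-per-pixel screen
flattened row by row, with runs that alternate colour and always start
with background (0). The names rle_encode and rle_decode are
hypothetical.

```python
def rle_encode(pixels):
    """Encode a flat sequence of 0/1 pixels as a list of run lengths.

    Runs alternate colour, starting with 0 (background), so a bitmap
    that begins with a lit pixel gets a leading run of length 0.
    This format is an assumption, not the one used on the device.
    """
    runs = []
    current = 0   # colour of the run currently being counted
    length = 0
    for p in pixels:
        if p == current:
            length += 1
        else:
            runs.append(length)   # close the finished run
            current = p
            length = 1
    runs.append(length)           # close the final run
    return runs


def rle_decode(runs):
    """Rebuild the flat 0/1 pixel list from alternating run lengths."""
    pixels = []
    colour = 0
    for length in runs:
        pixels.extend([colour] * length)
        colour ^= 1               # flip between 0 and 1
    return pixels


# One 8-pixel row of a translated message "picture".
row = [0, 0, 0, 1, 1, 0, 0, 0]
encoded = rle_encode(row)
decoded = rle_decode(encoded)
```

Because the localizer ships a finished picture per message, the device
needs no per-language font or character table, which is why every
language can go through the same code path.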

Of course, if Michael is driving a print head and not a display, none of
this discussion matters: he has to print whatever characters he
gets.

Addison

=======================================================
Addison P. Phillips Principal Consultant
Inter-Locale LLC http://www.inter-locale.com
Globalization Engineering & Consulting Services

+1 408.210.3569 (mobile) +1 408.904.4762 (fax)
=======================================================

> =====SNIP=====8<-----
>
> Or you can be terminally practical...
>
> 1.) Write out all your Japanese strings, Kanji and all, as you wish
> them to appear on your display.
>
> 2.) Make a table containing only those characters, numbering them
> from 0 on up. Sort them by Unicode character number, low to high. This is
> your character list.
>
> 3.) Create an array indexed by the table entry that contains the
> Unicode character number in each position to match the character list.
> This is your lookup array.
>
> 4.) Create an array of bitmap information for the characters which
> is, of course, index matched to the lookup array. This is your bitmap
> array.
>
> 5.) Store your strings in the Unicode format of your choice.
>
> 6.) To display, simply get the Unicode value of each character,
> perform a binary search (fast!) on the lookup array, and get the bitmap info
> from the bitmap array to display.
>
> This could give you full Kanji support in very tight memory.
>
>
> HTH,
>
> /|/|ike
>
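
The steps in the quoted message can be sketched roughly as follows.
This is an illustration under assumptions, not Mike's actual code: the
names build_lookup and glyph_index, the sample messages, and the
placeholder bitmap bytes are all hypothetical, and a real device would
hold these tables in ROM rather than build them at run time.

```python
from bisect import bisect_left


def build_lookup(strings):
    """Steps 1-3: collect the distinct characters used by the messages
    and return their Unicode code points sorted low to high."""
    return sorted({ord(ch) for s in strings for ch in s})


def glyph_index(lookup, ch):
    """Step 6: binary-search the lookup array for a character.

    Returns the index into the parallel bitmap array, or None when the
    character was never given a bitmap.
    """
    code = ord(ch)
    i = bisect_left(lookup, code)
    if i < len(lookup) and lookup[i] == code:
        return i
    return None


# Hypothetical message set; only these characters get bitmaps.
MESSAGES = ["\u65e5\u672c\u8a9e", "\u65e5\u672c"]
LOOKUP = build_lookup(MESSAGES)

# Step 4: bitmap array index-matched to LOOKUP (placeholder bytes here,
# not real glyph data).
BITMAPS = [b"\x00"] * len(LOOKUP)

index = glyph_index(LOOKUP, "\u672c")   # a character in the table
missing = glyph_index(LOOKUP, "A")      # a character that is not
```

The None case is the limitation raised in point 5 above: any character
outside the canned messages has no bitmap, so arbitrary user input
cannot be displayed.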


