RE: Subset of Unicode to represent Japanese Kanji?

From: Ayers, Mike (Mike_Ayers@bmc.com)
Date: Fri Jul 14 2000 - 16:39:58 EDT


> From: addison@inter-locale.com [mailto:addison@inter-locale.com]
> Sent: Saturday, July 15, 2000 5:09 AM
>
> This assumes that the only support required is output of previously
> translated internal messages. If *that's* the case, you can usually
> save even more memory by using a run-length encoded "picture" of the
> bitmap on the display (in fact, I suggested exactly that solution for
> a, shall we say, "closely related product" a few years ago)... and
> they went with the solution Mike suggests below instead. It's a clever
> solution.

        Thanks. Storing a bitmap of each entire string might be better
for a very small number of display strings. My solution was designed for an
intermediate number of strings, where memory would be saved by encoding
each multiply-used character only once.
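
        To make the trade-off concrete, here is a minimal sketch of the
shared-glyph idea, not the code from any actual product. The function names,
the one-byte string codes, and the 32-byte glyph size (a 16x16 1-bit bitmap)
are all assumptions made for illustration.

```python
# Illustrative sketch only: names and sizes below are assumptions.
GLYPH_BYTES = 32  # assumed 16x16 pixels at 1 bit per pixel


def build_glyph_table(strings):
    """Give each distinct character one slot; encode strings as slot indices."""
    slots = {}
    encoded = []
    for s in strings:
        codes = []
        for ch in s:
            if ch not in slots:
                slots[ch] = len(slots)  # first use of this character
            codes.append(slots[ch])
        # One byte per character; bytes() raises ValueError past 256 glyphs.
        encoded.append(bytes(codes))
    glyphs = list(slots)  # dict insertion order equals slot order
    return glyphs, encoded


def shared_table_cost(strings, glyph_bytes=GLYPH_BYTES):
    """Bytes for one bitmap per distinct character plus the index strings."""
    glyphs, encoded = build_glyph_table(strings)
    return len(glyphs) * glyph_bytes + sum(len(e) for e in encoded)


def per_string_bitmap_cost(strings, glyph_bytes=GLYPH_BYTES):
    """Bytes if every string is stored as its own uncompressed bitmap."""
    return sum(len(s) for s in strings) * glyph_bytes


messages = ["電池残量", "電池交換", "残量不足"]
print(shared_table_cost(messages))       # 8 glyphs * 32 + 12 index bytes = 268
print(per_string_bitmap_cost(messages))  # 12 characters * 32 = 384
```

The saving grows with the amount of character reuse across strings; with
only a handful of strings and little reuse, the per-string approach can win.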

> This turned out to be a pain in the neck for all concerned.
>
> 1. It meant that the ROM font and character tables had to be revised
> every time the software was localized. Given the limited real estate
> in an LCD, this basically meant redoing the "font" map every time.

        Of course, but the problem posed was for only a single localization
- Japanese.

> 2. I had a printout of Shift-JIS I kept in a drawer with highlighter
> on it to show what characters we had bitmaps of. The translators never
> knew which code points they'd assigned: we had to track it all after
> the fact.

        That's bad project management, to be blunt. I made a few
assumptions about a "memory constrained device" - one of those assumptions
was that it would probably be PROM based, and therefore all of the software
would be swapped on an upgrade. Then you would only need to keep track of
the characters you were using for each individual software version. If the
bitmap array and display strings were to be managed independently, I agree
that this would not be a good way to go.

> 3. This leads to fallacious thinking on the part of the software
> developers, who are keen to carve up the English into words and reuse
> the words (eek! I have only one "the" in English... and how many forms
> in German? French? Spanish?)
> 4. The software has to work completely differently for Japanese than
> for the Latin-1 languages. This is not good: it leads to two code
> bases, multiple test regimes, weird bugs.

        Again, I was presented with a single locale problem.

> 5. You can never, ever display user input (unless limited to, say,
> kana or romanji, or Latin-1+, or limited to canned messages)

        Agreed.

> The real-real answer is, as suggested yesterday, fix the device to
> support the necessary requirements (more ROM, more display, whatever).
> Yes, ::sigh::, I know I defended the other side last night! But I18N
> should start before the factory cranks out 50,000 devices....;-)...
> usually, though, it is too late and we get to be clever, rather than
> right.

        Let's not propose one-size-fits-all solutions. If Michael is porting
an English app to Japanese and "limited memory" translates to "let's cram
Japanese into the same footprint as English", then I agree with you. If,
however, he is designing a small footprint, Japanese only, imminent
obsolescence device (think pagers), then I would tend to disagree.

> I still favor RLE in this instance, because it lets the localizers
> get the absolute max out of the display real estate and ALL of the
> languages can work the same way.

        For multinational devices, I tend to agree, but the memory
requirements for such a method can get severe as the number of possible
display strings increases.
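
        For reference, a minimal sketch of run-length encoding a row of
1-bit display pixels. This is a generic RLE, not the scheme used in the
product discussed above; the function names and the (value, run length)
pair format are assumptions made for illustration.

```python
# Illustrative sketch only: a generic RLE over 0/1 pixels.
def rle_encode(bits):
    """Collapse a sequence of 0/1 pixels into (value, run_length) pairs."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1      # extend the current run
        else:
            runs.append([b, 1])   # start a new run
    return [(value, length) for value, length in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the pixel sequence."""
    pixels = []
    for value, length in runs:
        pixels.extend([value] * length)
    return pixels


row = [0, 0, 0, 1, 1, 0]
print(rle_encode(row))                     # [(0, 3), (1, 2), (0, 1)]
print(rle_decode(rle_encode(row)) == row)  # True
```

Each display string gets its own encoded picture, so nothing is shared
between strings and the total storage grows with every string added.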

/|/|ike


