Re: mapping Shift-JIS user-defined chars to Unicode

From: Geoffrey Waigh (anzu@home.com)
Date: Fri Jun 11 1999 - 12:31:59 EDT


Masahiko Maedera wrote:
>
> Shift-JIS has 1880 characters for private use.
> They are mapped into Unicode in the following way.
>
> Shift-JIS                               Unicode
> 0xF040 - 0xF0FC (excluding 0xF07F) ---> U+E000 - U+E0BB
> 0xF140 - 0xF1FC (excluding 0xF17F) ---> U+E0BC - U+E177
> ... same way ...
> 0xF940 - 0xF9FC (excluding 0xF97F) ---> U+E69C - U+E757

Well, perhaps some applications map them that way, since U+E000
is the start of the Unicode *Private* Use block; but every
Unicode implementation is entitled to assign other semantics
to those code points.

To make it clear, I know of at least one implementation where
the first few hundred codepoints of the PUA are assigned for
internal purposes. Of course it has convenient methods for
mapping non-Unicode glyphs to other parts of the PUA and back,
but people relying on the previously discussed hexadecimal
input method would have to know the local mapping.
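
For concreteness, here is a minimal sketch of the arithmetic in the
quoted table: 10 lead bytes (0xF0 - 0xF9) times 188 trail bytes
(0x40 - 0xFC, skipping 0x7F) gives the 1880 code points. The function
names and the `base` parameter are my own, purely for illustration;
`base` defaults to U+E000 as in the table, but an implementation that
reserves the start of the PUA would pass something else. This is one
possible convention, not a mapping any standard requires.

```python
PUA_BASE = 0xE000   # assumed default; a local mapping may start elsewhere
ROW_SIZE = 188      # trail bytes 0x40..0xFC, excluding 0x7F
ROWS = 10           # lead bytes 0xF0..0xF9


def sjis_udc_to_pua(code: int, base: int = PUA_BASE) -> int:
    """Map a Shift-JIS user-defined code (0xF040..0xF9FC) to a PUA code point."""
    lead, trail = code >> 8, code & 0xFF
    if not 0xF0 <= lead <= 0xF9:
        raise ValueError(f"lead byte out of range: {lead:#04x}")
    if not 0x40 <= trail <= 0xFC or trail == 0x7F:
        raise ValueError(f"trail byte out of range: {trail:#04x}")
    cell = trail - 0x40
    if trail > 0x7F:
        cell -= 1  # close the gap left by the unused 0x7F
    return base + (lead - 0xF0) * ROW_SIZE + cell


def pua_to_sjis_udc(cp: int, base: int = PUA_BASE) -> int:
    """Inverse of sjis_udc_to_pua for the same base."""
    offset = cp - base
    if not 0 <= offset < ROWS * ROW_SIZE:
        raise ValueError(f"code point outside mapped range: U+{cp:04X}")
    row, cell = divmod(offset, ROW_SIZE)
    trail = 0x40 + cell
    if trail >= 0x7F:
        trail += 1  # skip over 0x7F again
    return ((0xF0 + row) << 8) | trail
```

With the default base this reproduces the endpoints in the quoted
table, e.g. 0xF140 maps to U+E0BC and 0xF9FC to U+E757. Anyone typing
code points in by hand would still need to know which base the local
system actually uses.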

> *** However, This is a very important notice. ***
>
> Now JCS-WG2 (http://jcs.aa.tufs.ac.jp/jcs/) is prohibiting
> private use of this entire area,
> and new JIS characters (called JIS levels 3 and 4) will be
> mapped into it. Some of these characters may be mapped
> into CJK Extension A (U+3400 - U+4DFF).
> So developers should create a new mapping table for them.

When Unicode 3.0 is released, people will happily extend their
mapping tables to accommodate the new characters. If JCS-WG2
is redefining this private mapping as your message suggests,
I think that makes it abundantly clear why third parties
cannot be expected to implement their private use ideas -
they break compatibility between different revisions of their
own standards, let alone with what anyone else on the planet
might be doing.

Geoffrey Waigh


