Re: Fun with UDCs in Shift-JIS

From: Lars Marius Garshol (larsga@garshol.priv.no)
Date: Fri Jan 18 2002 - 08:44:27 EST


* Addison Phillips
|
| According to Lunde (p. 205), the range is through F9FC. There are
| real characters in the range FA40 -> FC4B, at least in CP932, which
| may be causing you some confusion, since these have concrete
| mappings to Unicode (not just a mapping in the U+E000 range).

A very good point. I never found this discussion in Lunde (why on
earth is it not in the section that discusses the structure of
Shift-JIS?), so I didn't know this. So it seems that there is a
well-defined mapping for the characters FA40 - FC4B that duplicates
characters encoded elsewhere in Shift-JIS. Lunde does not have a
complete table for this range, however. Does anyone know of one?

The problem is what to do about the rest of the range. Lunde suggests
mapping to the Unicode PUA, but I don't think that mapping is what the
people using these characters in web pages expect.
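
For concreteness, here is a minimal sketch of what a PUA mapping for the
user-defined range F040 - F9FC could look like. It assumes a simple linear
layout: 188 valid trail bytes per lead byte, starting at U+E000. I believe
this matches what Microsoft's CP932 table does, but treat that as an
assumption rather than something taken from Lunde. The function name
sjis_udc_to_pua is made up for illustration.

```python
def sjis_udc_to_pua(lead: int, trail: int) -> int:
    """Map a Shift-JIS user-defined character to a PUA code point.

    Assumption (not from Lunde): linear layout, 188 trail bytes per
    lead byte, first character F040 placed at U+E000.
    """
    if not 0xF0 <= lead <= 0xF9:
        raise ValueError("lead byte outside the user-defined range F0-F9")
    if 0x40 <= trail <= 0x7E:
        offset = trail - 0x40
    elif 0x80 <= trail <= 0xFC:
        # 0x7F is never a valid trail byte, so close the gap.
        offset = trail - 0x41
    else:
        raise ValueError("invalid Shift-JIS trail byte")
    return 0xE000 + 188 * (lead - 0xF0) + offset


if __name__ == "__main__":
    # First and last characters of the user-defined range.
    print(f"F040 -> U+{sjis_udc_to_pua(0xF0, 0x40):04X}")
    print(f"F9FC -> U+{sjis_udc_to_pua(0xF9, 0xFC):04X}")
```

Under that assumption the 1880 user-defined positions fill U+E000 through
U+E757. The arithmetic is the easy part; it does nothing to tell a reader
which glyph the page author actually meant.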

--Lars M.



This archive was generated by hypermail 2.1.2 : Fri Jan 18 2002 - 08:19:36 EST