From: Kenneth Whistler (email@example.com)
Date: Mon Mar 23 2009 - 18:29:37 CST
Tim Greenwood asked:
> This question really belongs in the ICU-support mail list, but I tried
> there and had no response. Some of the people who hang out here are
> good at answering these obscure questions.
> The mapping from SJIS to Unicode (as seen on
> ) has three odd conversions in the control range.
> 0x1A -> 0x1C
> 0x1C -> 0x7F
> 0x7F -> 0x1A
> I do not see anything equivalent in EUCJP mappings, nor can I find any
> reference that shows JIS201 differing from standard practice in the
> control codes.
I wouldn't expect that behavior at all to be derived from JIS X 0201
or EUC-JP. And I know I certainly don't support that kind of
mapping for any SJIS variety.
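
For illustration only, the three conversions quoted above can be
written out as a small table. This is a sketch, not the ICU table
itself: the names ODD_CONTROL_SWAPS, to_unicode and from_unicode are
hypothetical, and the identity mapping for every other byte below
0x80 is an assumption, since the question lists only the three odd
entries.

```python
# The three control-range conversions quoted in the question
# (SJIS byte -> Unicode code point). Everything else is assumed identity.
ODD_CONTROL_SWAPS = {0x1A: 0x1C, 0x1C: 0x7F, 0x7F: 0x1A}

# Inverse table, for mapping back from Unicode to the SJIS byte.
INVERSE_SWAPS = {cp: byte for byte, cp in ODD_CONTROL_SWAPS.items()}


def to_unicode(byte: int) -> int:
    """Map a single byte in 0x00..0x7F the way the reported table does."""
    if not 0x00 <= byte <= 0x7F:
        raise ValueError("only the single-byte range below 0x80 is modelled")
    return ODD_CONTROL_SWAPS.get(byte, byte)


def from_unicode(code_point: int) -> int:
    """Map a code point in U+0000..U+007F back to the SJIS byte."""
    if not 0x00 <= code_point <= 0x7F:
        raise ValueError("only code points below U+0080 are modelled")
    return INVERSE_SWAPS.get(code_point, code_point)


# Bytes whose mapping differs from a plain identity mapping.
changed = [b for b in range(0x80) if to_unicode(b) != b]
print([hex(b) for b in changed])
```

The three entries form a closed cycle (0x1A -> 0x1C -> 0x7F -> 0x1A),
so the mapping still round-trips; it just moves three control codes
away from the positions a straight ASCII-compatible table would use.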
> I know that Unicode no longer supports these mapping tables, and even
> when it did the SJIS table does not define these ranges.
And I don't think it derives from any SJIS mapping ever posted
on the Unicode website.
> Can anyone shed any light on this issue?
My best guess is that this is an empirical mapping based
on testing actual mapping behavior on one (or more)
And I suspect what is involved is some weird backwards compatibility
issue having to do with the implementation of Ctrl-Z EOF marks
in MS-DOS. 0x1A is always strange, because its 6429 definition
is SUB, but it saw its widest usage in the CP/M --> MS-DOS line
of OS development as an EOF mark.
Could be wrong, though. ;-)
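
As a rough sketch of the Ctrl-Z convention mentioned above: tools in
the CP/M and MS-DOS tradition treated the first 0x1A byte in a text
file as the end of the text, ignoring anything after it. The function
name read_dos_text is hypothetical, and this models only the
convention, not any particular converter's behavior.

```python
CTRL_Z = 0x1A  # SUB in the control-code standards, used as an EOF mark in CP/M and MS-DOS text files


def read_dos_text(data: bytes) -> bytes:
    """Return the bytes up to, but not including, the first Ctrl-Z.

    If no Ctrl-Z is present, the data is returned unchanged.
    """
    end = data.find(bytes([CTRL_Z]))
    return data if end == -1 else data[:end]


sample = b"hello\r\n\x1aleftover bytes after the EOF mark"
print(read_dos_text(sample))
```

A byte with that kind of special meaning is one that a converter
might plausibly want to move out of the way, which fits the guess
above but does not confirm it.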