Re: Mapping of SJIS control characters

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Mon Mar 23 2009 - 19:33:59 CST

    On 3/23/2009 5:29 PM, Kenneth Whistler wrote:
    > Tim Greenwood asked:
    >
    >
    >> This question really belongs in the ICU-support mail list, but I tried
    >> there and had no response. Some of the people who hang out here are
    >> good at answering these obscure questions.
    >>
    >> The mapping from SJIS to Unicode (as seen on
    >> http://www.icu-project.org/icu-bin/convexp?conv=ibm-943_P15A-2003&s=ALL
    >> ) has three odd conversions in the control range.
    >>
    >> 0x1A -> 0x1C
    >> 0x1C -> 0x7F
    >> 0x7F -> 0x1A
    >>
    >> I do not see anything equivalent in EUCJP mappings, nor can I find any
    >> reference that shows JIS201 differing from standard practice in the
    >> control codes.
    >>
    It looks like a remapping of control-code values to other control-code
    values. In H/W-based systems, the character ROM may have had images for
    control codes. Chances are, those were not uniform for Japanese
    hardware. If so, this remapping is intended to let something like a
    terminal handler, defined for one platform (e.g. ASCII-based
    terminals), keep working, with the ShiftJIS codes adjusted to "fit".
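
As a rough illustration of the remapping described above: the three reported conversions form a closed cycle among 0x1A, 0x1C and 0x7F, so no control value is lost and the mapping can be reversed. The sketch below is in Python. The table and function names are hypothetical, and it models only those three byte values, not the actual ICU converter for ibm-943_P15A-2003.

```python
from typing import Mapping

# Hypothetical table holding only the three control-range conversions
# quoted in the question; every other byte is left untouched here.
SJIS_TO_UNICODE_CONTROLS = {0x1A: 0x1C, 0x1C: 0x7F, 0x7F: 0x1A}

# Reverse direction, built by inverting the table above.
UNICODE_TO_SJIS_CONTROLS = {u: b for b, u in SJIS_TO_UNICODE_CONTROLS.items()}


def remap_controls(data: bytes, table: Mapping[int, int]) -> bytes:
    """Replace each byte found in `table`; pass all other bytes through."""
    return bytes(table.get(b, b) for b in data)


# The set of sources equals the set of targets, so this is a permutation
# of control values rather than a lossy fold.
is_permutation = set(SJIS_TO_UNICODE_CONTROLS) == set(SJIS_TO_UNICODE_CONTROLS.values())

sample = bytes([0x41, 0x1A, 0x1C, 0x7F, 0x0A])
forward = remap_controls(sample, SJIS_TO_UNICODE_CONTROLS)
round_trip = remap_controls(forward, UNICODE_TO_SJIS_CONTROLS)
```

Because the three values only swap places, applying the inverse table to `forward` gives back `sample` unchanged, which is consistent with a deliberate control-to-control adjustment rather than a table error.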

    In principle, the same could be true for true "device" codes, but I find
    that less likely. [The old OEM "dos" code pages had graphical images
    associated with control codes, and there's a way to tell the API to use
    them.]

    The reason control codes aren't necessarily mapped graphically is that
    you may not be able to tell from the data stream whether to interpret
    them as characters (symbols) or as controls.

    Lacking access to terminal emulators or original H/W, that's what I'd
    surmise.

    A./


