Re: CJK question

From: Eric Rasmussen (kerasmus@mac.com)
Date: Sun Mar 23 2003 - 10:38:55 EST

    > From: "Allen Haaheim"
    > ... it seems that "[s]ince GB18030 is fully ISO 10646 compatible, it
    > readily supports CJK Extension B and other languages." I don't have
    > the GB18030 font or Extension B Charset in my machine. Can I load CJK
    > Extensions A and B without switching to XP? I would prefer to use
    > Win2000, or the ME which I am running now, but if necessary I can use
    > XP.

    In terms of CJK characters, the GB 18030 character set includes only
    the CJK Unified Ideographs and CJK Unified Ideographs Extension A
    blocks. As an encoding, it can represent the Supplementary Ideographic
    Plane (and thus CJK Unified Ideographs Extension B), but no SIP
    characters are currently defined in GB 18030. So the "SimSun 18030"
    font that comes with the GB 18030 support package is not a misnomer:
    it does contain the complete character set.
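
    To illustrate the encoding side of that distinction, here is a
    minimal Python sketch of how a supplementary-plane code point gets a
    four-byte GB 18030 form. The function name is hypothetical, and the
    arithmetic is the linear four-byte mapping as I understand it from
    the standard; the result is checked against Python's built-in
    "gb18030" codec rather than taken on faith.

```python
# Sketch only: the helper name is made up for illustration.
def gb18030_supplementary_bytes(cp: int) -> bytes:
    """Four-byte GB 18030 form of a code point in U+10000..U+10FFFF."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    # Supplementary code points are numbered linearly from U+10000,
    # and that index is spread over four bytes starting at 90 30 81 30.
    index = cp - 0x10000
    b1, rest = divmod(index, 12600)  # 12600 = 10 * 126 * 10
    b2, rest = divmod(rest, 1260)    # 1260 = 126 * 10
    b3, b4 = divmod(rest, 10)
    return bytes([0x90 + b1, 0x30 + b2, 0x81 + b3, 0x30 + b4])


EXT_B_FIRST = 0x20000  # first code point of CJK Extension B

by_hand = gb18030_supplementary_bytes(EXT_B_FIRST)
by_codec = chr(EXT_B_FIRST).encode("gb18030")
print(by_hand.hex(), by_codec.hex())
```

    Being able to encode the code point says nothing about whether a
    given font has a glyph for it, which is the point made above.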

    Windows 2000 will do, but you must have Office XP. I run Windows 2000
    with Office XP and the 2002 Proofing Tools installed. The extended
    font is called "SimSun (Founder Extended)" [filename: SURSONG.TTF] and
    contains around 64,000 CJK ideographs: most of Extension B, but not
    all. I can reach every character in this font through the Simplified
    Chinese "Enhanced Unicode IME", which has already been mentioned, by
    entering the UTF-16 code (the surrogate pair), not the scalar value.
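
    For anyone unsure of the difference between the UTF-16 code and the
    scalar value, here is a small Python sketch of the conversion for
    characters outside the BMP. The function name is hypothetical, and
    the exact input format the IME expects is not covered here; this
    only shows how the two numbers relate.

```python
# Sketch only: converts a Unicode scalar value above U+FFFF into the
# pair of UTF-16 code units (high surrogate, low surrogate).
def to_surrogate_pair(scalar: int) -> tuple:
    if not 0x10000 <= scalar <= 0x10FFFF:
        raise ValueError("scalar value is not in a supplementary plane")
    offset = scalar - 0x10000          # 20 significant bits
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x20000)  # first Extension B code point
print(f"U+20000 -> {high:04X} {low:04X}")
```

    So a character whose scalar value is 20000 would be looked up as
    the two code units D840 DC00.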

    > From: "Allen Haaheim"
    > I tried what you suggested with unipad, but for some reason it went to
    > a location on a PUA character map, rather than CJK Unified Ideographs
    > Extension B, where they are in fact located.

    It goes to a PUA location because that is where the character sits
    in the appropriate CHANT font. The two examples you gave map to
    E596 and E58E, as has been noted. The correct characters are in the
    "ICS1" font. I would be curious to know how many of the glyphs in ICS1
    are now in Unicode and which are not. ICS3, by the way, has a very
    nice and accurately rendered set of oracle-bone and early
    bronze-inscription characters.
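
    A quick way to see that E596 and E58E are Private Use Area code
    points and not Extension B is to test them against the block
    ranges. The sketch below is Python; the function name and the small
    table are my own, with the ranges taken from the Unicode block
    definitions.

```python
# Sketch only: a tiny block lookup covering just the ranges discussed here.
BLOCKS = [
    (0x3400, 0x4DBF, "CJK Unified Ideographs Extension A"),
    (0x4E00, 0x9FFF, "CJK Unified Ideographs"),
    (0xE000, 0xF8FF, "Private Use Area"),
    (0x20000, 0x2A6DF, "CJK Unified Ideographs Extension B"),
]


def block_of(cp: int) -> str:
    """Return the block name for cp, or 'other' if it is not in the table."""
    for first, last, name in BLOCKS:
        if first <= cp <= last:
            return name
    return "other"


for cp in (0xE596, 0xE58E, 0x20000):
    print(f"U+{cp:04X}: {block_of(cp)}")
```

    Anything reported as Private Use Area depends entirely on the font
    in use, which is why the same code shows a different character once
    the CHANT font is swapped out.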

    Regards, Eric Rasmussen


