Re: Manchu/Mongolian in Unicode

From: Andrew C. West (andrewcwest@alumni.princeton.edu)
Date: Tue Oct 15 2002 - 09:55:16 EDT


    On Tue, 15 Oct 2002, "Stefan Persson" wrote:

    > That font also includes some characters mapped to the PUA: A € sign, and
    > several 漢 character, many of which look like radicals. Why? Is that
    > something that's also required by that law?
    >

    It's my experience that many fonts include gunk in the Private Use Area. A quick check of some of
    the CJK glyphs in the PUA of SimSun-18030 shows that they are not unique to the PUA, but are also
    mapped to codepoints in the CJK Radicals Supplement and CJK Unified Ideographs Extension A
    blocks, for example.
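
    For anyone wanting to spot Private Use codepoints in a piece of text, here is a minimal sketch,
    assuming Python and its standard unicodedata module. The function names is_private_use and
    private_use_chars are purely illustrative. It inspects text only; checking which glyphs a font
    maps into the PUA would need a font-inspection tool, which is not shown here.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch is a Private Use code point."""
    # General category "Co" covers the BMP Private Use Area (U+E000..U+F8FF)
    # as well as the two supplementary private use planes.
    return unicodedata.category(ch) == "Co"


def private_use_chars(text: str) -> list[str]:
    """List the Private Use code points found in text, formatted as U+XXXX."""
    return [f"U+{ord(ch):04X}" for ch in text if is_private_use(ch)]
```

    Calling private_use_chars on a string copied from a page set in such a font would list any
    characters that rely on the font's private mappings rather than on standard codepoints.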

    I believe the intention is to maintain a one-to-one correspondence between the GB18030 standard
    and Unicode, so there should be no need for any supplementary glyphs in the PUA.
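
    As a rough illustration of the round trip that such a correspondence implies, the sketch below
    uses Python's built-in "gb18030" codec. That codec is an assumption on my part: it reflects one
    implementation of the mapping tables, not the text of the standard itself. The helper name
    round_trips and the sample characters are hypothetical choices for the example.

```python
def round_trips(ch: str) -> bool:
    """Return True if ch survives Unicode -> GB18030 -> Unicode unchanged."""
    encoded = ch.encode("gb18030")
    return encoded.decode("gb18030") == ch


# Sample characters: a common ideograph, the euro sign, one character from
# CJK Radicals Supplement, and the first character of CJK Extension A.
samples = ["\u6f22", "\u20ac", "\u2e81", "\u3400"]
results = {f"U+{ord(c):04X}": round_trips(c) for c in samples}
```

    If every value in results is True, the codec at least treats those characters as having a
    reversible GB18030 encoding, with no detour through the PUA needed to represent them.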

    The new PRC law is, as you hint, overly restrictive and prescriptive, and is, I think, a serious
    setback for the popularisation of Unicode on the Web. The intent is that GB18030 should replace
    GB2312 and Big5, so that instead of the current mishmash of GB2312 (SC) and Big5 (TC) websites,
    Traditional and Simplified Chinese sites (at least those hosted in China) will in future use the
    same GB18030 encoding.

    Where does this leave websites written in Unicode Chinese? Out in the cold!

    At present web pages written in Unicode Chinese (some of mine for example) are not being indexed by
    Google, and are ignored by both Yahoo China (SC) and Chinese Yahoo (TC). The situation will
    certainly not be improved by the replacement of GB2312 and Big5 with GB18030.

    Andrew


