Re: GB18030 mapping

From: Christopher Fynn
Date: Fri Jan 07 2005 - 10:01:56 CST

    Andrew C. West wrote:

    > Well it all depends. A text editor might import a GB18030 document with BrdaRten
    > SetA characters, and using the code point mapping tables convert AAA1 etc. to
    > U+E000 etc. The user then selects a BrdaRten font that maps precomposed BrdaRten
    > glyphs to U+E000 etc. and everything is displayed correctly. This kind of
    > support does not need any modifications to the mapping tables as the mapping of
    > U+E000 to <0F40, 0F74> is irrelevant ... PUA characters are just PUA characters,
    > and if you have the right font these PUA characters will be rendered as
    > precomposed BrdaRten glyphs.
    > Of course if you then want to treat these PUA characters as real Unicode Tibetan
    > you need to know the character mapping, but from my perspective character
    > mapping is something that is optionally applied on top of the code point
    > mapping.

    As soon as you want to edit the text in a Unicode-based application,
    you'd probably need to convert (or "character map") the BrdaRten PUA
    characters to "real Unicode" [or you might end up with the horrors of a
    kind of mixed encoding]. Comparing "real Unicode" text with
    precomposed Tibetan (PUA or GB18030), and collating it, would also be
    difficult without conversion.
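
    A rough sketch of the point, in Python. The table name PUA_TO_TIBETAN
    and the function character_map are made up for illustration, and the
    table holds only the one pair mentioned above (U+E000 to <0F40, 0F74>);
    a real BrdaRten table would have many more rows. It assumes the code
    point mapping from GB18030 to the PUA has already been applied.

```python
# Hypothetical one-entry character mapping table. Only this pair,
# U+E000 -> <U+0F40, U+0F74>, comes from the discussion; a real table
# would cover the whole precomposed BrdaRten set.
PUA_TO_TIBETAN = {
    "\uE000": "\u0F40\u0F74",  # TIBETAN LETTER KA + TIBETAN VOWEL SIGN U
}


def character_map(text: str) -> str:
    """Replace known PUA characters with standard Tibetan sequences.

    Characters without an entry in the table are passed through unchanged.
    """
    return "".join(PUA_TO_TIBETAN.get(ch, ch) for ch in text)


# Text as it looks after the code point mapping only (still PUA).
pua_text = "\uE000"
# The same syllable as entered in a Unicode-based application.
unicode_text = "\u0F40\u0F74"

# Without the character mapping the two never compare equal.
print(pua_text == unicode_text)
# After the character mapping they do.
print(character_map(pua_text) == unicode_text)
```

    The first comparison fails and the second succeeds, which is the
    mixed-encoding problem in miniature: searching, comparing and sorting
    all see two unrelated strings until the PUA text has been converted.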

    - Chris