Re: GB 18030 Certification

From: Christopher Fynn (
Date: Sun Aug 21 2005 - 22:55:13 CDT


    Andrew West wrote:

    > If support for the Chinese "Set A" set of precomposed Tibetan stacks
    > is now a requirement for GB18030 certification, then I would have
    > thought that OpenType Tibetan fonts such as Ximalaya and Tibetan
    > Machine Uni that already fully support Unicode Tibetan by means of
    > OpenType tables could be made GB18030 compliant by adding in extra
    > mappings from the PUA codepoints defined in Set A to the appropriate
    > glyphs in the font where available or by decomposing the PUA code
    > points using OpenType features.

    A single set of glyphs is fine, but the lookups could be very complicated.
    Unless you always perform some kind of "normalization", a single
    document edited on diverse systems could end up in a kind of mixed
    encoding (partly pre-composed and partly "atomic" Unicode), or
    something in between. What happens when you add a single
    combining consonant to a precomposed consonant stack?

    Without normalization of some kind the font lookup tables needed
    to handle every possible way of encoding each stack could quickly
    become unmanageable and difficult to debug.
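
    As a rough sketch of what such a normalization step might look like,
    assuming a table that maps each precomposed PUA code point to its
    standard Unicode sequence: the PUA values below (U+E000, U+E001) are
    placeholders for illustration, not the real Set A assignments, and
    the names PUA_TO_UNICODE and normalize_stacks are hypothetical.

```python
# Placeholder table: U+E000 and U+E001 are NOT real Set A code points.
# Each entry maps one precomposed stack to the standard Unicode Tibetan
# sequence (base consonant followed by subjoined consonants).
PUA_TO_UNICODE = {
    "\uE000": "\u0F66\u0F90",        # SA + subjoined KA
    "\uE001": "\u0F66\u0F92\u0FB2",  # SA + subjoined GA + subjoined RA
}


def normalize_stacks(text: str) -> str:
    """Replace each precomposed PUA stack with its atomic Unicode sequence.

    Characters that are not in the table pass through unchanged.
    """
    return "".join(PUA_TO_UNICODE.get(ch, ch) for ch in text)


# The case asked about above: a precomposed stack followed by a single
# combining consonant (U+0FB1, subjoined YA).
mixed = "\uE000\u0FB1"
normalized = normalize_stacks(mixed)
```

    After this step the text contains only the atomic sequence, so the
    font needs lookups for one encoding of the stack rather than for
    every mixture of the two.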

    I guess MS Windows at least will try to map every

    The PRC's precomposed / PUA encoding of Tibetan seems to be designed to
    avoid the need for anything like OpenType shaping or "smart font"
    technology. Since they are used to huge CJK character sets and fonts,
    6,000+ pre-composed Tibetan "characters" may seem to make more sense to
    them than adding support for "smart" fonts and complex script shaping.

    IMO assigning PUA mappings to pre-composed combinations in existing OT
    fonts is not a good idea as it might only encourage the creation of
    documents with mixed encoding.
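
    To illustrate what "mixed encoding" means in practice, here is a
    minimal sketch of a check for it. It assumes that a purely
    precomposed document never uses the Unicode subjoined consonants
    (U+0F90..U+0FBC) and that any BMP PUA character (U+E000..U+F8FF) is
    a precomposed stack. Both are simplifying assumptions, and the
    function name is hypothetical.

```python
def encoding_profile(text: str) -> str:
    """Classify text as 'precomposed', 'atomic', 'mixed', or 'neither'.

    Assumption: any BMP Private Use Area character is a precomposed
    stack, and subjoined consonants (U+0F90..U+0FBC) only occur in
    atomic Unicode encoding.
    """
    has_pua = any(0xE000 <= ord(c) <= 0xF8FF for c in text)
    has_subjoined = any(0x0F90 <= ord(c) <= 0x0FBC for c in text)
    if has_pua and has_subjoined:
        return "mixed"
    if has_pua:
        return "precomposed"
    if has_subjoined:
        return "atomic"
    return "neither"
```

    A document that reports "mixed" is the kind that would need
    normalizing before a font with only one set of lookups could
    render it reliably.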

    > In principle, it should be fairly
    > straightforward to support both encoding mechanisms in a single
    > OpenType font using a single set of glyphs.

    You'd still need support for OT shaping, which is exactly what such
    encoding schemes seem designed to avoid.

    - Chris
