Re: ISO 10646 & GB18030 repertoire

From: Christopher Fynn (cfynn@gmx.net)
Date: Fri Jan 07 2005 - 14:22:13 CST

Next message: Philippe VERDY: "Re: Re: ISO 10646 & GB18030 repertoire"

Previous message: Mike Ayers: "RE: ISO 10646 & GB18030 repetoire"
In reply to: Mike Ayers: "RE: ISO 10646 & GB18030 repetoire"
Next in thread: Philippe VERDY: "Re: Re: ISO 10646 & GB18030 repertoire"
Maybe reply: Philippe VERDY: "Re: Re: ISO 10646 & GB18030 repertoire"

Mike Ayers <mike.ayers@tumbleweed.com> wrote:

> Second, and more importantly, since GB18030 does not encode all of
> Unicode, it cannot be considered a Unicode encoding form.

While it isn't exactly a "Unicode encoding form", I thought that GB18030,
although primarily a superset of GBK, is also in effect a superset of
ISO 10646: it includes all characters in ISO 10646 (though at different
positions) and has more code positions than ISO 10646 and Unicode.

For instance the document "IBM Simplified Chinese Graphic Character Set,
GB 18030 code: National Standard and DBCS-Host" (2001) says:

| 4.4 GB 18030
| GB 18030, PRC National Standard, contains all characters
| defined in ISO 10646-1, but they have totally
| different code assignment. In GB 18030, one-byte,
| two-byte and four-byte encoding systems are adopted.
| The total capability is over 1.5 millions of code positions.
| Currently, GB 18030 contains more than 27 000
| Chinese characters which have been defined in the
| latest version of ISO 10646-1.
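
The "over 1.5 million" figure can be checked with a little arithmetic. The sketch below is only an illustration: the byte ranges are the ones commonly documented for GB 18030 (they are not given in the IBM document quoted above), and the variable names are made up for this example.

```python
# Byte ranges as commonly documented for GB 18030 (an assumption here,
# not taken from the quoted IBM document).
SINGLE_BYTE = 0x7F - 0x00 + 1                     # 0x00..0x7F
LEAD = 0xFE - 0x81 + 1                            # 0x81..0xFE (bytes 1 and 3)
TRAIL_2 = (0x7E - 0x40 + 1) + (0xFE - 0x80 + 1)   # 0x40..0x7E and 0x80..0xFE
DIGIT = 0x39 - 0x30 + 1                           # 0x30..0x39 (bytes 2 and 4)

two_byte = LEAD * TRAIL_2                  # two-byte code positions
four_byte = LEAD * DIGIT * LEAD * DIGIT    # four-byte code positions
total = SINGLE_BYTE + two_byte + four_byte

print(f"one-byte:  {SINGLE_BYTE:>9,}")
print(f"two-byte:  {two_byte:>9,}")
print(f"four-byte: {four_byte:>9,}")
print(f"total:     {total:>9,}")
```

Under these assumptions the four-byte area alone supplies 1,587,600 positions, which is more than the 1,114,112 code points of Unicode's 17 planes.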

And Meyer's GB18030 Summary
<ftp://ftp.oreilly.com/pub/examples/nutshell/cjkv/pdf/GB18030_Summary.pdf>
says:

| The Significant properties of GB18030 are
| o It incorporates Unicode's Unihan Extension A completely.
| o It provides code-space for all used and unused code points of
| Unicode's Plane 0 (BMP) and its 16 additional planes if these
| code points were not already included in GBK.
| Expressed differently: while being a code- and character
| compatible "superset" of GBK, at the same time intends to
| provide space for all remaining code points of Unicode.
| Thus it effectively provides a 1-to-1 relationship between
| parts of GB 18030 and Unicode's complete encoding space.
...
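
The 1-to-1 relationship described in the summary can be spot-checked with any GB18030 converter. The following is a minimal sketch, assuming Python's built-in "gb18030" codec is a faithful implementation; the sample code points and the helper names are chosen only for illustration and it is not an exhaustive test.

```python
# Sample code points: ASCII, a hanzi already in GBK, the first non-ASCII
# code point, a CJK Extension A ideograph, and both ends of the
# supplementary planes (U+10000 and U+10FFFF).
SAMPLES = [0x41, 0x4E2D, 0x80, 0x3400, 0x10000, 0x10FFFF]


def gb18030_bytes(code_point: int) -> bytes:
    """Encode one Unicode code point with Python's gb18030 codec."""
    return chr(code_point).encode("gb18030")


def round_trips(code_point: int) -> bool:
    """True if encoding and then decoding returns the same character."""
    return gb18030_bytes(code_point).decode("gb18030") == chr(code_point)


for cp in SAMPLES:
    encoded = gb18030_bytes(cp)
    print(f"U+{cp:04X} -> {encoded.hex()} "
          f"({len(encoded)} bytes, round trip: {round_trips(cp)})")
```

Each sample encodes to one, two or four bytes and decodes back to the original code point, which is what the "different positions, same repertoire" reading would predict.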

- Chris


This archive was generated by hypermail 2.1.5 : Fri Jan 07 2005 - 14:27:42 CST