Re: New BMP characters (was Re: [very OT] Documentation: beyond

From: Jungshik Shin
Date: Wed Feb 21 2001 - 14:31:27 EST

On Wed, 21 Feb 2001, Thomas Chan wrote:

> On Wed, 21 Feb 2001, Jungshik Shin wrote:
> > On Wed, 21 Feb 2001, Jungshik Shin wrote:
> > > On Wed, 21 Feb 2001, Werner LEMBERG wrote:
> > > > > South Korea's PKS 5700
> > > > This is a North Korean standard AFAIK.
> > >
> > > No. AFAIK, PKS stands for 'Proposed Korean Standard' and as such PKS 5700
> > > became KS C 5700 which in turn was renamed KS X 1005-1. Then, what is
> > > KS X 1005-1? It's just the Korean version of ISO 10646 (aligned with
> > > Unicode 2.0).
> >
> > I could be wrong in saying that PKS C 5700 became KS C 5700 although
> > it's (almost) certain that PKS represents 'Proposed Korean Standard'
> > (where Korean means South Korean). Unicode 3.0 (p. 259) lists two PKS
> > C's as K source 2 and K source 3 (PKS C 5700-1 1994 and PKS C 5700-2
> > 1994) and <>
> > lists PKS C 5700-3 1998 as another K source. What is this mysterious
> > PKS C 5700-[1-3]? I asked around in the past but haven't obtained the
> > definitive answer. Perhaps, I should ask someone in IRG.
> The unihan.txt file ver 3.0b1 (1999.7.2) lists four K- sources as:
> K0 KS C 5601-1987
> K1 KS C 5657-1991
> K2 PKS C 5700-1 1994
> K3 PKS C 5700-2 1994

K2 and K3 are what I meant by K source 2 and K source 3 above. BTW,
all these references to KS C 5xxx in Unicode/ISO 10646 should be replaced
with KS X 1xxx (or at least the new names should be given in parentheses):
KS C 5601 -> KS X 1001, KS C 5657 -> KS X 1002. It's unfortunate that
this didn't make it into TUS 3.0, because the naming scheme change
predated TUS 3.0. I'll address this in a separate message.

> It's very clear what K0 and K1 are, and they are given as GR ranges
> arranged by pronunciation, and it is okay that these ranges overlap, since
> K0 and K1 are two different character sets.

Hmm, it's not a big deal, but I wonder why they're given as GR ranges
instead of just row-column values (or GL). Somebody must have mixed
things up .......
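
For anyone unsure how the three notations relate, here is a minimal
sketch, assuming a 94x94 double-byte set in the ISO 2022 style (as
KS C 5601 is). The function names are mine, not from any standard or
library. GL uses bytes 0x21..0x7E, GR is the same value with the high
bit set on both bytes, and row-column numbers are the GL bytes minus 0x20.

```python
def gl_to_row_col(code):
    """Split a two-byte GL code (0x2121..0x7E7E) into 1-based (row, column)."""
    hi, lo = code >> 8, code & 0xFF
    if not (0x21 <= hi <= 0x7E and 0x21 <= lo <= 0x7E):
        raise ValueError("not a valid GL code: %#06x" % code)
    return hi - 0x20, lo - 0x20


def gl_to_gr(code):
    """Set the high bit of both bytes (GL -> GR)."""
    gl_to_row_col(code)  # validates the range, result is discarded
    return code | 0x8080


def gr_to_gl(code):
    """Clear the high bit of both bytes (GR -> GL)."""
    if code & 0x8080 != 0x8080:
        raise ValueError("not a valid GR code: %#06x" % code)
    gl = code & 0x7F7F
    gl_to_row_col(gl)  # validates the range, result is discarded
    return gl


if __name__ == "__main__":
    # End points of the K2 range quoted in this thread, read as GL values.
    for gl in (0x2121, 0x7530):
        print("GL %#06x  GR %#06x  row-col %s"
              % (gl, gl_to_gr(gl), gl_to_row_col(gl)))
```

So a GR range can always be turned back into GL or row-column values
mechanically; the choice among the three is purely one of presentation.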

> K2 has what appears to be GL ranges given for it (0x2121 .. 0x7530), and
> arranged by radical+strokes. K3 looks similar, having what appear to be
> GL ranges (0x2121 .. 0x3771), arranged by radical+strokes, but they all
> fall within CJK Extension A. The ranges given for K2 and K3 also overlap.
> (They seem reminiscent of the "planes" of CNS 11643 / EUC-TW .)

By K2 and K3 overlapping, you do not mean that some characters in Ext. B
are given references to both K2 and K3, do you? If not, the overlap is
natural and all right, for the same reason you gave for the overlapping
K0 and K1 ranges: it indicates that K2 and K3 have repertoires disjoint
from each other (i.e. the intersection of K2 and K3 is the empty set),
just like K0 and K1 do.
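
The distinction between overlapping code ranges and overlapping
repertoires can be sketched as follows. This is only an illustration:
the two small tables are made up for the example and are NOT the real
K2/K3 mappings, and overlap_report is a hypothetical helper name.

```python
def overlap_report(src_a, src_b):
    """Compare two source tables that map a source code to a Unicode code point.

    Returns (codes used by both sources, characters present in both sources).
    """
    shared_codes = set(src_a) & set(src_b)
    shared_chars = set(src_a.values()) & set(src_b.values())
    return shared_codes, shared_chars


# Made-up sample data, for illustration only (not the actual K2/K3 tables).
k2_sample = {0x2121: 0x4E00, 0x2122: 0x4E01}
k3_sample = {0x2121: 0x3400, 0x2122: 0x3401}

if __name__ == "__main__":
    codes, chars = overlap_report(k2_sample, k3_sample)
    print("codes used by both sources:", sorted(hex(c) for c in codes))
    print("characters in both sources:", sorted(hex(c) for c in chars))
```

With data shaped like this, the code ranges coincide while the set of
shared characters is empty, which is the harmless kind of overlap.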

> According to the 02n34428_cjk_b_fcd_mapping.txt file[1] (May 2000?), the
> K source (#4?) is given as a decimal number from 0002 .. 0269, arranged by
> radical+strokes, and all within CJK Extension B (but this file only deals
> with Ext B, so that doesn't mean much). There seem to be some gaps in the
> numbering, though. I'm not sure what to make of this in relation to K2
> and K3, or the whole "PKS C 5700" thing. The later date (1998 vs. 1994)
> must also be of some significance.

lists all the Korean industrial standards related to information
exchange, and there's no trace of PKS C 5700-[1-3] (1994|1998). It might
be that the South Korean representatives to the IRG came up with a
makeshift naming scheme (thus independent of the KS naming scheme) for
the collection of CJK ideographs (used in South Korea) that they
submitted to the IRG. If that's the case, I think this 'PKS C 5700'
question can only be resolved by contacting the South Korean
representatives to the IRG.

BTW, at <>, Korean standard documents can be purchased
(in PDF and in print). Unfortunately, not all standards are available
in PDF.

> [1] Available at

Thank you for the reference.

Jungshik Shin
