Re: CCCII and CNS questions (& EACC facts)

From: Joan Aliprand (BR.JMA@rlg.org)
Date: Thu Jan 14 1999 - 16:12:26 EST


John Clews (10646er@sesame.demon.co.uk) commented on Werner Lemberg's
CCCII-CNS project:

[snip]

>It should also be remembered that CCCII closely maps to EACC - the
>American National Standard East Asian character code set for
>bibliographic use (ANSI Z39.???? - full number mislaid). Most
>character placements are the same in both EACC and CCCII: a
>relatively small number (100 or so??) are different.

EACC (in part) is a SUBSET of CCCII. Werner Lemberg wrote "the CCCII fonts
contain about 74000 glyphs". EACC contains over 13,000 ideographs, and
over 15,000 characters in all.

While most EACC characters do have the same encoding as their CCCII
equivalent, this is not true for all of them. EACC also includes PRC
characters, kana, Japanese kokuji, and hangul (over 2,000 characters) that
were not present in CCCII when EACC was developed by RLG.

The designation is ANSI/NISO Z39.64-1989, "East Asian Character Code for
Bibliographic Use". It was reaffirmed in 1995. The code charts are published
on microfiche.

>Although an ANSI standard, this is published by NISO (National
>Information Standards Organization) in the USA rather than by ANSI
>itself (although I suspect ANSI would also sell copies as it's an
>ANSI standard).

NISO (http://www.niso.org) is accredited by ANSI to develop technical
standards for information services, libraries, and publishing. ANSI says
(http://www.ansi.org) "ANSI does not itself develop American National
Standards (ANSs); rather it facilitates development by establishing
consensus among qualified groups."

John Cowan <cowan@locke.ccil.org> added:

>John Clews wrote:
>
>> In relation to Unicode mapping (or not) it may be noted that an
>> earlier _draft_ of part of Unicode included a table showing the CJK
>> correspondences between Unicode and GB, CNS, JIS, KSC, _and_ EACC was
>> included, which was not included in later published versions of UCS
>> (ISO/IEC 10646 and Unicode).
>
>EACC mappings are available in the Unihan database at
>ftp://ftp.unicode.org/Public/UNIDATA/Unihan.txt , along with
>BigFive, CCCII, CNS 11643-1986 and -1992, GB 2312-80, GB 12345-90,
>GB 7589-87, GB 7590-87, "General Use Characters for Modern Chinese",
>GB 8565-89, IBM Japanese, JIS X 0208-1990, JIS X 0212-1990,
>KS C 5601-1989, KS C 5657-1991, and Xerox CCS mappings.

The EACC mappings in this resource are:
(a) only for ideographs,
(b) DRAFT, and
(c) only for those EACC ideographs that correspond 1:1 to a URO code point.
    ["URO" = Unified Repertoire and Ordering, also known as "Unified Han".]
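
As a rough illustration of how these mappings could be checked against the
CCCII ones, here is a minimal Python sketch. It assumes the tab-separated
"U+XXXX <tab> tag <tab> value" layout and the tag names kEACC and kCCCII; the
file read by this sketch is only a small in-memory sample, and its code values
are made-up placeholders, not real Unihan data. The function names are
hypothetical.

```python
import io
from collections import defaultdict

# Placeholder sample in the assumed Unihan.txt layout.
# The hex values are invented for illustration only.
SAMPLE = (
    "# comment line\n"
    "U+4E00\tkCCCII\t213021\n"
    "U+4E00\tkEACC\t213021\n"
    "U+4E01\tkCCCII\t213022\n"
    "U+4E01\tkEACC\t2D3022\n"
    "U+4E02\tkCCCII\t213023\n"
)


def read_fields(lines, wanted=("kEACC", "kCCCII")):
    """Collect the wanted tags per code point: {"U+4E00": {"kEACC": ...}}."""
    table = defaultdict(dict)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip anything not in the assumed three-column form
        code_point, tag, value = parts
        if tag in wanted:
            table[code_point][tag] = value
    return table


def compare(table):
    """Sort code points by how their EACC and CCCII values relate."""
    result = {"same": [], "different": [], "eacc_only": [], "cccii_only": []}
    for code_point in sorted(table):
        eacc = table[code_point].get("kEACC")
        cccii = table[code_point].get("kCCCII")
        if eacc is not None and cccii is not None:
            key = "same" if eacc == cccii else "different"
        elif eacc is not None:
            key = "eacc_only"
        else:
            key = "cccii_only"
        result[key].append(code_point)
    return result


if __name__ == "__main__":
    report = compare(read_fields(io.StringIO(SAMPLE)))
    for key, code_points in report.items():
        print(key, len(code_points), code_points)
```

To try it on a real copy of the file, pass an open file object to
read_fields() in place of the StringIO sample. Given points (a) to (c), a
code point with no kEACC entry says nothing about whether EACC encodes that
character.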

>If CCCII is being documented in relation to CNS (which could
>therefore reflect CNS relationships with Unicode) it may be useful to
>include EACC relationships in this document too.

If the intent of Werner Lemberg's project is the same as Doug Schiffer's --
to identify ideographs that are not in Unicode/ISO 10646 -- then there is
no need to include EACC, because a preliminary list already exists of the
EACC characters that are not in the URO (but could well be in CJK Extension A).

Anyone working on identifying ideographs that are not in Unicode/ISO 10646
should make sure they know about the extensive work being done by the
Ideographic Rapporteur Group. The IRG is responsible to WG2 for the
analysis of ideographs. The IRG Web site is at
http://www.cse.cuhk.edu.hk/~irg/

-- Joan Aliprand
   Senior Analyst
   Research Libraries Group (RLG)

To: UNICODE@UNICODE.ORG


