Re: CNS 11643-1992 plane 15

From: Arne Götje (高盛華) (arne@linux.org.tw)
Date: Tue Jun 13 2006 - 20:14:37 CDT

    On Wednesday 14 June 2006 00:47, John H. Jenkins wrote:
    > On Jun 13, 2006, at 8:21 AM, Werner LEMBERG wrote:
    > >> CNS11643 defined a 16-plane space in 1986 but only defined the first
    > >> two planes. Planes 14-16 were used during the late '80s and early '90s
    > >> in the EUC-TW system as experimental or "private usage" in some
    > >> government systems. The 1992 version defines 7 planes.
    > >
    > > Until now everybody has said that CNS 11643 from 1992 has only
    > > seven planes. So the question is still unanswered what plane 15 in
    > > Unihan.txt actually refers to.
    >
    > Note that plane 15 occurs in the kIRG_TSource. The official IRG
    > sources on occasion differ from their printed counterparts, typically
    > to allow inclusion in Unicode of characters which have not yet been
    > officially standardized in their home country.
    >
    > What "plane 15" means in this case is that the Taipei Computer
    > Association, which owns CNS 11643, handed the IRG a set of mappings
    > which contained this mysterious plane. That's as much an answer as
    > you can get from the Unicode end of things. For more information,
    > you'll have to contact TCA.
    >
    > Note that the kCNS1992 field does *not* contain plane 15 mappings.
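
    To see which code points carry a plane 15 T-source, one can filter the
    Unihan data for kIRG_TSource values. A minimal sketch follows; it
    assumes tab-separated "U+XXXX<TAB>field<TAB>value" lines and a value of
    the form "F-2121" or "TF-2121" (one hex digit for the plane, optionally
    prefixed with "T"). The function name and the sample rows are made up
    for illustration and are not taken from Unihan.txt.

```python
import re

# Assumed value syntax: optional "T", one hex plane digit, "-", four hex digits.
_T_SOURCE = re.compile(r"^T?([0-9A-F])-([0-9A-F]{4})$")


def t_source_entries(lines, plane):
    """Yield (code_point, row_cell) for kIRG_TSource values in the given plane."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) != 3 or fields[1] != "kIRG_TSource":
            continue  # some other field, e.g. kCNS1992
        match = _T_SOURCE.match(fields[2])
        if match and int(match.group(1), 16) == plane:
            yield fields[0], match.group(2)


if __name__ == "__main__":
    # Invented sample rows, only to show the expected line shape.
    sample = [
        "# comment line",
        "U+AAAA\tkIRG_TSource\t1-4421",
        "U+BBBB\tkIRG_TSource\tF-2121",
        "U+BBBB\tkCNS1992\t1-4421",
    ]
    for code_point, row_cell in t_source_entries(sample, 15):
        print(code_point, row_cell)
```

    Passing an open file object instead of the sample list should work the
    same way, provided the file really uses the line layout assumed above.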

    From the CNS11643 website:
    http://61.60.106.73/eng/word.jsp#cns11643

    -------------------- snip -------------------------
    (2) User-defined Areas
            To cater for different types of Chinese information processing,
    CNS11643 has reserved character plane 12 to 15 for user-defined
    characters. Chinese characters or symbols that have yet to be
    classified as national standard characters are coded in this area based
    on user requirements.
    Up to 48,027 Chinese characters are encoded in the amended and extended
    version of CNS11643. The code has covered characters as defined in the
    four "Table of Standard Chinese Characters" namely in the categories of
    frequently used, less frequently used, rarely used and Chinese
    character variants. However, since the implementation of the on-line
    computerized Residency Information System, the characters used to
    construct the national population database have exceeded the national
    standard characters by some 30,000 characters used for names. To enable
    data transmission and interchange for this type of character codes, the
    EDPC, Executive Yuan temporarily defined the interchange codes in
    user-defined areas: Character Plane 15: Coding interval from 2121 to
    6D39 is encoded with 6,831 Chinese characters. Ideographs are sourced
    from the 15th character plane of the Residency Information System. EUC
    codes are used in the Residency Information System and the encoding
    principles of EUC codes are identical to those of CNS11643. For easier
    understanding, existing ideographs and definitions are used. However,
    amongst the 7,167 characters defined in character plane 15 of the
    Residency Information System, there are 2 self-repeating characters and
    336 repeated characters that were already included in the first 7 CNS
    character planes. To avoid the situation of "one word, two codes",
    repeated parts are deleted to save the Household Registration and
    Military Service departments from having to repetitively convert codes;
    the spaces originally occupied by repeated characters are left blank
    after deletion.
    -------------------- snip -------------------------
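
    The numbers in the quoted passage can be checked with a little
    arithmetic. The sketch below assumes the usual 94x94 layout (row and
    cell bytes each running from 0x21 to 0x7E) and the four-byte EUC-TW
    form 0x8E, 0xA0+plane, row|0x80, cell|0x80; the function names are my
    own, and the two-byte form used for plane 1 is left out.

```python
def _index(code):
    """Linear position of a row/cell code (e.g. 0x2121) in a 94x94 plane."""
    row, cell = code >> 8, code & 0xFF
    if not (0x21 <= row <= 0x7E and 0x21 <= cell <= 0x7E):
        raise ValueError(f"not a valid 94x94 code: {code:#06x}")
    return (row - 0x21) * 94 + (cell - 0x21)


def positions_in_interval(start, end):
    """Count the code positions from start to end, both inclusive."""
    return _index(end) - _index(start) + 1


def euc_tw_bytes(plane, code):
    """Four-byte EUC-TW sequence for a plane number (1-16) and row/cell code."""
    if not 1 <= plane <= 16:
        raise ValueError(f"plane out of range: {plane}")
    _index(code)  # validates row and cell
    return bytes([0x8E, 0xA0 + plane, (code >> 8) | 0x80, (code & 0xFF) | 0x80])


if __name__ == "__main__":
    print(positions_in_interval(0x2121, 0x6D39))  # 7169
    print(7167 - 336)                             # 6831
    print(euc_tw_bytes(15, 0x2121).hex())         # 8eafa1a1
    print(euc_tw_bytes(15, 0x6D39).hex())         # 8eafedb9
```

    So the interval 2121 to 6D39 holds 7,169 positions, and 7,167 minus the
    336 repeated characters gives the 6,831 quoted for plane 15. Whether
    the two positions separating 7,169 from 7,167 correspond to the two
    self-repeating characters is only a guess on my part.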

    BTW: CJK Extension C will contain characters from Plane 12-15 as well as
    some missing ones from Plane 3 AFAIR.

    Maybe after the release of CJK Extension C there will be a
    new "official" version of CNS 11643...

    Cheers
    Arne

    -- 
    Arne Götje (高盛華) <arne@linux.org.tw>
    PGP/GnuPG key: 1024D/685D1E8C
    Fingerprint: 2056 F6B7 DEA8 B478 311F  1C34 6E9F D06E 685D 1E8C
    Key available at wwwkeys.pgp.net.   Encrypted e-mail preferred.
    
    



