Question on Unicode data files

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Mon Feb 26 2001 - 09:49:05 EST


The Unicode FTP site (ftp://ftp.unicode.org/Public, now temporarily remapped
to http://www.unicode.org/Public) contains several files that map East Asian
character sets to and from Unicode.

Are all these sources in sync with one another? If not, which of them should
be trusted?

- UNIDATA/CJKXREF.TXT (containing Big-5, CCCII-1, CNS-1, CNS-2, CNS-E,
EACC=ANSI-Z39-64-89, GB-0=2312-80, GB-1=12345-90, GB-3=7589-87,
GB-5=7590-87, GB-7=GUCfMC, GB-8=8565-89, JIS-0=X-0208-90, JIS-1=X-0212-90,
JIS-IBM, KS-C-0=5601-87, KS-C-1=5657-1991, KSC-IBM, Xerox)

- MAPPINGS/EASTASIA/EASTASIA/CJKXREF.TXT (containing the same mappings as
above)

- MAPPINGS/EASTASIA/EASTASIA/UNIHAN.TXT
(Big-5, CCCII, CNS-86, CNS-92, EACC, GB-0=2312-80, GB-1=12345-90,
GB-3=7589-87, GB-5=7590-87, GB-7=GUCfMC, GB-8=8565-89, IBM-Japan,
JIS-0=X-0208-90, JIS-1=X-0212-90, KSC-0=5601-89, KSC-1=5657-1991,
Pseudo-GB-1=12345-90, Telegraph-PRC, Telegraph-Taiwan, Xerox)

- MAPPINGS/EASTASIA/EASTASIA/GB/GB12345.TXT
- MAPPINGS/EASTASIA/EASTASIA/GB/GB2312.TXT

- MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0201.TXT
- MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0208.TXT
- MAPPINGS/EASTASIA/EASTASIA/JIS/JIS0212.TXT
- MAPPINGS/EASTASIA/EASTASIA/JIS/SHIFTJIS.TXT

- MAPPINGS/EASTASIA/EASTASIA/KSC/HANGUL.TXT
- MAPPINGS/EASTASIA/EASTASIA/KSC/JOHAB.TXT (containing
KS-X-1001-97=KS-C-5601-92)
- MAPPINGS/EASTASIA/EASTASIA/KSC/KSC5601.TXT
- MAPPINGS/EASTASIA/EASTASIA/KSC/KSX1001.TXT
- MAPPINGS/EASTASIA/EASTASIA/KSC/OLD5601.TXT

- MAPPINGS/EASTASIA/EASTASIA/OTHER/BIG5.TXT
- MAPPINGS/EASTASIA/EASTASIA/OTHER/CNS11643.TXT
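
For what it is worth, here is a minimal sketch of how two of these tables
could be checked against each other. It assumes each data line holds
whitespace-separated hexadecimal columns with a 0x prefix and that "#" starts a
comment; which columns hold the legacy code and the Unicode value differs from
file to file, so the column indices are parameters. The function names
parse_mapping and diff_mappings are my own, not part of any Unicode tool.

```python
def parse_mapping(text, src_col=0, dst_col=1):
    """Build {legacy_code: unicode_value} from the text of a mapping table.

    Assumed layout: hex columns such as 0x2121, '#' introduces a comment.
    Lines that do not fit this layout are skipped.
    """
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        if len(fields) <= max(src_col, dst_col):
            continue
        try:
            src = int(fields[src_col], 16)  # int() accepts the 0x prefix in base 16
            dst = int(fields[dst_col], 16)
        except ValueError:
            continue
        table[src] = dst
    return table


def diff_mappings(a, b):
    """Return (codes only in a, codes only in b, codes mapped differently)."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    conflicts = sorted(
        (code, a[code], b[code]) for code in a.keys() & b.keys() if a[code] != b[code]
    )
    return only_a, only_b, conflicts
```

Reading two files into strings and passing them through parse_mapping and
diff_mappings would list the code points on which the tables disagree. It
cannot say which of the two is right.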

Moreover, the UNIDATA directory contains both <UnicodeData.txt> and
<UnicodeData-Latest.txt>. They always seem to be identical (same date and
time, same size).
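
Matching dates and sizes only suggest that the two files are the same. A
byte-for-byte comparison would settle it; a minimal sketch, assuming both files
have been downloaded locally (the helper name same_contents is hypothetical):

```python
import filecmp


def same_contents(path_a, path_b):
    """Return True if the two files hold exactly the same bytes."""
    # shallow=False makes filecmp read the file contents instead of
    # treating matching size and modification time as proof of equality.
    return filecmp.cmp(path_a, path_b, shallow=False)
```

For example, same_contents("UnicodeData.txt", "UnicodeData-Latest.txt") would
confirm or refute the impression above.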

Which one of them is the official Unicode database, and what is the other
one for?

Thanks.
_ Marco


