From: Zhang Weiwu (weiwuzhang@hotmail.com)
Date: Thu Feb 13 2003 - 07:56:46 EST
Are you trying to recognize it by eye or in your program?
If the webpage is in Unicode, it's hard to say. The bad thing is that, unlike "La", "The", and "Die" in European languages, the most frequent ideographs in the two forms of Chinese text are almost the same. Perhaps the ideograph meaning "for" (wei in Mandarin pinyin) is the most recognizable one that differs.
The traditional one 70BA looks like:
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=70BA
The simplified one 4E3A looks like:
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=4E3A
The most common measure word (a bit like the article "a" in English) also differs.
The traditional 500B
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=500B
The simplified 4E2A
http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=4E2A
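For the "in your program" case, the two character pairs above could be counted directly. This is only a minimal sketch in Python: the function name guess_chinese_script and the marker sets are made up for illustration, and two pairs are a very small sample, so short texts will often come back as "unknown".

```python
# Hypothetical helper: counts only the two character pairs named in this message.
TRADITIONAL_MARKERS = {"\u70BA", "\u500B"}  # traditional "wei" (70BA) and measure word (500B)
SIMPLIFIED_MARKERS = {"\u4E3A", "\u4E2A"}   # simplified "wei" (4E3A) and measure word (4E2A)


def guess_chinese_script(text):
    """Return 'traditional', 'simplified', or 'unknown' for a Unicode string."""
    trad = sum(1 for ch in text if ch in TRADITIONAL_MARKERS)
    simp = sum(1 for ch in text if ch in SIMPLIFIED_MARKERS)
    if trad > simp:
        return "traditional"
    if simp > trad:
        return "simplified"
    # No markers found, or an equal number of each: the heuristic cannot decide.
    return "unknown"
```

A real detector would need a much longer list of characters that differ between the two forms; the tie case is returned as "unknown" rather than guessed.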
If the webpage isn't in Unicode, a simple rule is that most traditional Chinese webpages are encoded in "Big5", while most simplified Chinese webpages are encoded in "GB2312" or "GB18030".
Anyway, if a chunk of Chinese text looks complex (many strokes per character), it is likely to be traditional.
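If the program knows the declared charset of the page or form, the rule of thumb above can be written as a lookup. Again just a sketch: script_from_charset is a hypothetical name, the table holds only the three encodings mentioned here, and the result is a guess rather than a guarantee (GB18030, for one, can also encode traditional characters).

```python
# Hypothetical lookup: maps a declared charset label to the script it usually implies.
CHARSET_TO_SCRIPT = {
    "big5": "traditional",
    "gb2312": "simplified",
    "gb18030": "simplified",  # rule of thumb only; GB18030 can carry traditional text too
}


def script_from_charset(label):
    """Return 'traditional', 'simplified', or 'unknown' for a charset label."""
    # Normalize case and surrounding whitespace, e.g. " Big5 " -> "big5".
    name = label.strip().lower()
    return CHARSET_TO_SCRIPT.get(name, "unknown")
```

For example, script_from_charset("Big5") gives "traditional", while a Unicode label such as "utf-8" gives "unknown" and the text itself has to be inspected.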
=================
Zhang Weiwu from Xiamen China
----- Original Message -----
From: "Paul Hastings" <paul@tei.or.th>
To: <unicode@unicode.org>
Sent: Thursday, February 13, 2003 7:35 PM
Subject: traditional vs simplified chinese
> i suppose this is a really simple minded question but is there any way of
> telling if an incoming chunk of text (say from a browser form) is
> traditional or simplified chinese?
>
> thanks.
> ----------------------------------------------------
> Paul Hastings paul@tei.or.th
> Director Environmental Information Center
> Thailand Environment Institute
> Member Team Macromedia (Allaire)
> http://www.tei.or.th/eic ---------------------------
>