RE: Emails in Chinese

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Wed Oct 03 2001 - 11:02:57 EDT


> Hello friends at Unicode,

Hello David.

> I am wondering if you could tell me why I can send an
> E-mail in Chinese characters to a friend in China who
> can receive it clearly, but when they write me in
> Chinese I receive a scrambled message that doesn't
> resemble Chinese writing. I am currently using universal
> translator 2000 supported by unicode to send them E-mails
> in Chinese. If they don't use unicode could this be why
> their messages are scrambled to me? If so, please let me
> know what I must do to set them up for proper
> communication. Your time and consideration is greatly appreciated.

E-mail messages contain a declaration called a MIME header, which carries
information about the format of the message, including which character set
is used. The relevant line normally looks like this:

        Content-Type: text/plain; charset="XXX"

where XXX is the name of the character set.
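
To check what a received message actually declares, the header can be read
programmatically. This is only a minimal sketch using Python's standard
"email" package; the sample message and the helper name declared_charset
are made up for illustration.

```python
from email import message_from_string

# A made-up message, just to have something to parse.
RAW_MESSAGE = (
    "From: friend@example.com\n"
    "Subject: hello\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="gb2312"\n'
    "\n"
    "(message body)\n"
)

def declared_charset(raw_message):
    """Return the charset named in the Content-Type line, or None."""
    msg = message_from_string(raw_message)
    # get_content_charset() lower-cases the name and returns None
    # when the message declares no charset at all.
    return msg.get_content_charset()

print(declared_charset(RAW_MESSAGE))
```

A result of None corresponds to the case where the sending software did not
declare any charset.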

There are several "charsets" that can be used for Chinese. Here are the most
common of them, each with an example Content-Type line:
        
        GB-2312 EUC-encoded, People's Republic of China, simplified characters:
                Content-Type: text/plain; charset="gb2312"

        GB-2312 HZ-encoded, People's Republic of China, simplified characters:
                Content-Type: text/plain; charset="hz-gb-2312"

        Big Five, Republic of China = Taiwan, traditional characters:
                Content-Type: text/plain; charset="BIG5"

        Unicode UTF-8-encoded, international, both traditional and simplified:
                Content-Type: text/plain; charset="utf-8"

Now, chances are that:

1) your friend's e-mail software does not declare its charset (in this case,
ask them to "set the MIME header", hoping (s)he knows what to do);

2) your e-mail software does not understand MIME headers (in this case,
change it!);

3) your e-mail software does not understand that charset (in this case,
upgrade it with a module for that charset or, if not possible, change it).
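
For case 1, this is roughly what "declaring the charset" amounts to on the
sending side. It is a sketch with Python's standard "email" package, not a
description of any particular mail program; the sample text is an assumption
chosen for illustration.

```python
from email.message import EmailMessage

SAMPLE_TEXT = "你好"  # sample text, "hello" in Chinese

def build_message(text, charset):
    """Build a text/plain message that declares the given charset."""
    msg = EmailMessage()
    # set_content() encodes the text in the requested charset and
    # writes the matching Content-Type header for us.
    msg.set_content(text, charset=charset)
    return msg

msg = build_message(SAMPLE_TEXT, "gb2312")
content_type_line = msg["Content-Type"]      # names text/plain and the charset
declared = msg.get_content_charset()         # "gb2312"
```

A receiving program that understands MIME headers reads the same declaration
back and can decode the body without guessing.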

In any case, if you know what the charset is, you can copy and paste the
text into an application that allows you to manually set the text encoding.
Many word processors, Internet browsers, or e-mail clients can do this.
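
What "manually setting the text encoding" does can be sketched in a few
lines of Python. The sketch assumes that the scrambled text came from GB-2312
bytes being shown as Latin-1; that is only one possible cause, and the
function name redecode is made up for illustration.

```python
ORIGINAL = "中文"  # sample text, "Chinese (language)"

# The sender's software encodes the text as GB-2312 bytes...
raw_bytes = ORIGINAL.encode("gb2312")

# ...and a receiver that wrongly assumes Latin-1 displays this instead:
garbled = raw_bytes.decode("latin-1")   # "ÖÐÎÄ"

def redecode(garbled_text, wrong="latin-1", right="gb2312"):
    """Undo a wrong decoding step and apply the right charset."""
    # Going back through the wrong charset recovers the original bytes,
    # provided the wrong charset maps every byte to a character.
    return garbled_text.encode(wrong).decode(right)

recovered = redecode(garbled)           # back to the original text
```

This round trip works only when no bytes were lost or altered along the way,
for example by software that strips the high bit.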

If you don't know which charset it is, you can try sending a single line
from your friend's message to the Unicode mailing list: several people here
can recognize an encoding at a glance.

You can also try in sequence the four charsets that I listed above: it is
very unlikely that your friend used something else.
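
Trying the four charsets in sequence can also be sketched in Python, assuming
you can get at the raw bytes of the message body. Note that "hz" is Python's
own codec name for hz-gb-2312, and that the function name try_charsets is
made up for illustration.

```python
# Python codec names for the four charsets listed above.
CANDIDATES = ["gb2312", "hz", "big5", "utf-8"]

def try_charsets(data, candidates=CANDIDATES):
    """Decode the raw bytes under each candidate charset.

    Returns a dict mapping charset name to the decoded text, or to
    None when the bytes are not valid in that charset.
    """
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results

# Sample bytes: two Chinese characters encoded as GB-2312.
sample = "中文".encode("gb2312")
attempts = try_charsets(sample)
```

A decode that raises no error is not necessarily the right one: the same
bytes may be valid in more than one charset, so the candidates that survive
still have to be judged by eye.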

Hoping this helps...

_ Marco


