Re: Documenting in Tamil Computing

From: Barry Caplan (
Date: Mon Dec 16 2002 - 13:29:14 EST

    At 08:32 PM 12/15/2002 -0500, Jungshik Shin wrote:
    >> because
    >> Unicode is not mature enough to be used in multilingual email yet.
    >> You just have to make do with the 8bit TSCII encoding for Tamil eMail.
    > I don't understand what you meant by Unicode not being
    >mature enough to support multilingual emails. Modern email clients like
    >Netscape7/Mozilla, MS Outlook (Express), and Mutt support UTF-8 very well.

    Actually, it is not Unicode that is immature. It is SMTP, the core mail transport protocol, which is not 8-bit clean. The RFCs are very clear that only 7-bit data is allowed "over the wire".

    There are various extensions and kludges described in various RFCs (ESMTP, MIME, etc.), but they are not universally implemented at the server transport layer, let alone at the client layer.

    So Unicode, as commonly encoded (e.g. UTF-8), falls into a (very large) class of encodings that are not safe to pass over SMTP because they use 8 bits for the encoding of at least some characters.
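
    As an illustration (not part of the original message), here is a minimal Python sketch that checks whether encoded text fits in 7 bits. The helper name `is_7bit_clean` and the sample word are assumptions chosen for the example.

```python
# Hypothetical helper: True if every byte has its high (8th) bit clear.
def is_7bit_clean(data: bytes) -> bool:
    return all(b < 0x80 for b in data)

# Sample text (an assumption for illustration): the word "Tamil" in Tamil script.
tamil = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

ascii_bytes = "Tamil".encode("utf-8")
utf8_bytes = tamil.encode("utf-8")

print(is_7bit_clean(ascii_bytes))  # plain ASCII text stays within 7 bits
print(is_7bit_clean(utf8_bytes))   # Tamil text in UTF-8 needs the 8th bit
```

    Every byte of UTF-8 encoded Tamil text has the high bit set, so a strictly 7-bit hop cannot carry it unchanged. The same holds for an 8-bit encoding such as TSCII wherever it uses byte values above 127.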

    This is a well-known problem. Some mail servers do not follow the SMTP RFC exactly, in that they do not strip the 8th bit of all data by setting it to 0. If you are lucky and all the mail servers on the path between you and your recipient behave this way, then 8-bit data will go through.
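
    To make the failure mode concrete, the sketch below simulates a hop that clears the 8th bit of every byte. The function `strip_8th_bit` is a hypothetical model of such a relay, not the code of any real mail server.

```python
# Assumed model of a strict 7-bit hop: clear the high bit of every byte.
def strip_8th_bit(data: bytes) -> bytes:
    return bytes(b & 0x7F for b in data)

# Sample payload (an assumption): "Tamil" in Tamil script, encoded as UTF-8.
original = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd".encode("utf-8")
mangled = strip_8th_bit(original)

print(original != mangled)  # the payload was altered in transit
print(repr(mangled))        # what arrives is 7-bit ASCII debris, not Tamil
```

    Once the high bits are gone the original bytes cannot be recovered, which is why a single such server on the path is enough to corrupt the message.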

    But for arbitrary email from one address to another, you can't rely on it.
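
    For comparison, here is a sketch of the MIME approach mentioned above, using Python's standard `email` package to wrap UTF-8 text in a 7-bit-safe transfer encoding. It assumes the recipient's client understands MIME; the sample body is made up for the example.

```python
from email.mime.text import MIMEText

# Sample body (an assumption): "Tamil" in Tamil script.
body = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

# Declaring the charset as UTF-8 makes the library pick a transfer
# encoding that represents the 8-bit data using only ASCII characters.
msg = MIMEText(body, "plain", "utf-8")
wire = msg.as_bytes()

print(msg["Content-Transfer-Encoding"])  # the transfer encoding chosen
print(all(b < 0x80 for b in wire))       # True: nothing on the wire needs the 8th bit

# A MIME-aware client reverses the transfer encoding to get the text back.
decoded = msg.get_payload(decode=True).decode("utf-8")
print(decoded == body)
```

    The message survives 7-bit transport because the encoding happens before it reaches SMTP, but it is only readable if the receiving client decodes it.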

    Barry Caplan
