Re: browser encoding settings

From: Philippe Verdy (
Date: Tue May 31 2005 - 12:12:18 CDT

    True also in France: some servers were initially configured to accept emails
    using 8-bit encodings, and later they were reverted to only accept 7-bit

    Many mail servers in Japan only support 7-bit emails, because they were
    tweaked locally long ago to support Shift-JIS, and were never reconfigured
    to support other 8-bit encodings with anything other than the ugly MIME
    7-bit transfer encoding syntaxes.

    As for Indian charsets, there is no better-supported encoding (ISCII is
    rarely supported by browsers, mail agents or webmail servers), so the only
    choice that remains is UTF-7.

    But some webmail servers do not support UTF-7, only UTF-8, so users
    reading their emails online will be disappointed to see messages
    garbled with unreadable sequences like "+AO7-"... I think that Urdu readers
    need to use POP3 email agents, or choose a webmail service that does support
    the decoding of UTF-7 (in addition to UTF-8).
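
    To illustrate the garbling described above, here is a minimal Python
    sketch. The sample word is an assumption chosen for illustration, and the
    "UTF-8-only client" is simulated simply by decoding the UTF-7 bytes as
    UTF-8; real webmail behaviour may differ.

```python
# Illustrative sample text (not taken from the thread): the word "Urdu" in Urdu.
text = "اردو"

# UTF-7 produces pure ASCII bytes, so they pass through 7-bit mail servers.
wire = text.encode("utf-7")

# A client that only knows UTF-8 sees valid ASCII and shows it unchanged,
# i.e. the reader gets the raw "+...-" escape sequence instead of Urdu.
shown_by_utf8_only_client = wire.decode("utf-8")

# A client that does support UTF-7 recovers the original text.
shown_by_utf7_client = wire.decode("utf-7")

print(shown_by_utf8_only_client)
print(shown_by_utf7_client)
```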

    The alternative then is to use UTF-8 with a MIME 7-bit transfer encoding
    syntax (quoted printable). Note that Base64 would probably be more
    efficient, but some mail servers reject all Base64-encoded emails, because
    they think they only contain binary attachments which are thought to be
    undesirable for security (simply because Base64 is used most often for those
    binary attachments).
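
    The size difference between the two transfer encodings can be checked
    with a short Python sketch. The sample text is an assumption for
    illustration; actual ratios depend on how much ASCII the message contains.

```python
import base64
import quopri

# Illustrative Urdu sample (not from the thread).
sample = "اردو زبان"
utf8_bytes = sample.encode("utf-8")

# Quoted-Printable: every non-ASCII byte is written as "=XX" (3 characters).
qp = quopri.encodestring(utf8_bytes)

# Base64: every 3 input bytes become 4 output characters.
b64 = base64.b64encode(utf8_bytes)

print("UTF-8 bytes:     ", len(utf8_bytes))
print("Quoted-Printable:", len(qp))
print("Base64:          ", len(b64))
```

    For text that is mostly non-ASCII, Quoted-Printable approaches three
    times the raw size, while Base64 stays near four thirds.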

    If I had to send Urdu emails, I would choose UTF-8 with Quoted-Printable...
    Ugly because this is an inefficient encoding (so emails are larger), but at
    least it works on most platforms. Now the recipients need a browser or email
    agent capable of displaying Urdu texts (this is a separate issue: if your
    email is in Urdu, you can expect that users capable of reading this language
    have set up an environment with fonts and renderers suitable for the
    extended Arabic script, and Bidi rendering).
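
    As a rough sketch of that choice, Python's standard email package can
    build a UTF-8 message with the Quoted-Printable transfer encoding. The
    addresses, subject, body text and the helper name build_urdu_message are
    all hypothetical, chosen only for this example.

```python
from email.message import EmailMessage


def build_urdu_message(body: str) -> EmailMessage:
    """Build a plain-text message: UTF-8 charset, Quoted-Printable transfer."""
    msg = EmailMessage()
    msg["From"] = "sender@example.org"      # hypothetical address
    msg["To"] = "recipient@example.org"     # hypothetical address
    msg["Subject"] = "Urdu test"
    # charset and cte select UTF-8 with the 7-bit-safe Quoted-Printable syntax.
    msg.set_content(body, charset="utf-8", cte="quoted-printable")
    return msg


msg = build_urdu_message("اردو زبان")
raw = msg.as_bytes()  # what would be handed to the mail server
print(raw.decode("ascii"))
```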

    A more efficient encoding would use the ISO-8859 Latin+Arabic charset, also
    with Quoted-Printable (but here again, the Latin-Arabic charset is not
    commonly supported by webmail agents).
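
    A small Python sketch of that comparison, assuming the "ISO-8859
    Latin+Arabic charset" means ISO-8859-6. The sample strings and the helper
    fits_iso_8859_6 are hypothetical. The last check suggests that at least
    some Urdu-specific letters fall outside that charset, so it may not cover
    every Urdu text.

```python
import quopri

# Illustrative word using only letters that exist in ISO-8859-6.
arabic = "سلام"

# One byte per letter in ISO-8859-6, two bytes per letter in UTF-8,
# so the Quoted-Printable form is about half the size.
qp_utf8 = quopri.encodestring(arabic.encode("utf-8"))
qp_8859_6 = quopri.encodestring(arabic.encode("iso-8859-6"))
print(len(qp_utf8), len(qp_8859_6))


def fits_iso_8859_6(text: str) -> bool:
    """Return True if every character of text can be encoded in ISO-8859-6."""
    try:
        text.encode("iso-8859-6")
        return True
    except UnicodeEncodeError:
        return False


# U+0679 (ARABIC LETTER TTEH) is used in Urdu.
print(fits_iso_8859_6(arabic), fits_iso_8859_6("\u0679"))
```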

    BiDi text rendering is also an issue: if your email is plain text, not all
    email agents will render it properly (and the BiDi override controls
    defined in Unicode are too often ignored in many console applications, as
    they have no equivalent in legacy Arabic charsets). If you use HTML instead,
    you could alternatively use a "visual" encoding order for characters, using
    the <BDO> HTML override. This will complicate the composition of your email.

    ----- Original Message -----
    From: "Paul Hastings" <>
    To: <>
    Sent: Tuesday, May 31, 2005 7:36 AM
    Subject: Re: browser encoding settings

    > Dean Harding wrote:
    >> Like most character set conversions, they probably convert it from
    >> whatever the source encoding is to some form of Unicode (usually whatever
    >> is most convenient for the platform), and then into whatever output
    >> encoding they wanted (in this case UTF-8).
    > i'm not sure that's true for yahoo. we've had numerous headaches sending
    > utf-8 mail to their users. from what we were able to tease out of their
    > html it looks like the encoding is dependent on "where" the yahoo mail
    > server is. some "US" servers don't seem to have any html encoding hints at
    > all, "Chinese" servers seem to use GB2312, etc. users have had to manually
    > swap their browser's encoding, usually messing up the rest of the yahoo
    > content around the email. we couldn't find any official yahoo docs on this
    > (though maybe we didn't look hard enough or in the right places). we more
    > or less gave up on it and included an idiotic "if you can't read this
    > email...." tag.

    This archive was generated by hypermail 2.1.5 : Tue May 31 2005 - 12:13:09 CDT