From: Otto Stolz (Otto.Stolz@uni-konstanz.de)
Date: Tue Nov 23 2004 - 10:16:37 CST
you have written:
> I tried UTF-8 export to send an e-mail that contained
> several scattered unicode codepoints from the full
> 16-bit range from oooo to ffff from XP+Word to the
> university's Linux/Mozilla/OpenOffice/Kmail, enabled
> UTF-8 support. With very disappointing results.
For UTF-8 (or any other encoding except ISO 646 IRV
(aka ASCII)) to survive the transport via e-mail
(RFC 2821), it must be tagged and "transfer-encoded"
according to RFC 2045 and RFC 2047. For examples, cf.
(in German). It is the e-mail clients' responsibility
to do this tagging and encoding (on the sending side),
and the corresponding interpretation and decoding (on
the receiving side).
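As a minimal sketch of both sides, using Python's standard email package (the subject and body strings here are hypothetical examples, not from the original message): the sending client wraps non-ASCII headers in RFC 2047 "encoded words" and applies an RFC 2045 Content-Transfer-Encoding to the body; the receiving client reverses this.

```python
from email.header import Header, decode_header
from email.mime.text import MIMEText

# Sending side: a message body and subject containing non-ASCII text.
msg = MIMEText("Vergr\u00f6\u00dferung: \u20ac", _charset="utf-8")
msg["Subject"] = Header("Gr\u00fc\u00dfe aus Konstanz", "utf-8")

# The header becomes an RFC 2047 encoded word, e.g. "=?utf-8?...?=".
print(msg["Subject"].encode())
# The body is given an RFC 2045 transfer encoding (base64 for utf-8).
print(msg["Content-Transfer-Encoding"])

# Receiving side: decode the encoded word back to the original text.
raw_bytes, charset = decode_header(msg["Subject"].encode())[0]
print(raw_bytes.decode(charset))
```

A client that skips the tagging step on either side produces exactly the kind of garbled output described above.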
You have not mentioned which e-mail client program
you used, how it was configured, or what the
result looked like. Hence, the cause of your "very
disappointing results" cannot be determined (nor even
> 1. Do I expect too much assuming that UTF-8 just
> recodes the full 16-range in 8-bit but that
> text-programs with UTF-8 enabled should be able to
> reconstruct the full 16-bit range (as far as used)?
The Unicode range is much larger than 16 bits: a code
point needs up to 21 bits, though not all 21-bit values
are used (the range runs from U+0000 to U+10FFFF).
UTF-8 encodes every single character in 1 through 4 bytes;
cf. <http://www.unicode.org/faq/utf_bom.html> for more
details. I do not understand what you mean by
"reconstruct", but I guess your question is answered in
the cited WWW page.
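The variable-length encoding can be sketched in a few lines of Python; the four sample characters below are illustrative picks, one for each byte length:

```python
# UTF-8 uses 1 to 4 bytes per code point; the last sample,
# U+1D11E (musical G clef), lies outside the 16-bit BMP.
for ch in ["A", "\u00e9", "\u20ac", "\U0001d11e"]:
    data = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {len(data)} byte(s): {data}")
```

Decoding with `bytes.decode("utf-8")` reconstructs the original code points exactly, including those above U+FFFF.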
This archive was generated by hypermail 2.1.5 : Tue Nov 23 2004 - 10:21:57 CST