> Francois M Richard wrote:
> > Can Unicode conformance be applied to rtf (and how)?
Newer Microsoft products (from Office 97 onwards?) seem to use constructs
of the form \uXXXX\'YY to encode Unicode characters, where XXXX is the
*decimal* Unicode value (written as a signed 16-bit number, so values
above 32767 appear as negative numbers) and YY is the hex code of a
replacement character in the ANSI code page, as a fallback for
non-Unicode-aware readers. The RTF source text itself is encoded in
7-bit ASCII, and the code page used to interpret the \'YY commands is
specified by the \ansicpgN command in the header.
This is apparently the method used by many Windows applications
internally to exchange Unicode data, e.g. through the clipboard. Just save
a sample Word document with some Unicode characters to RTF to see how it
works. There are more details on this somewhere
in the MSDN library, under "Specifications/Applications/Rich Text Format".
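
To make the \uXXXX\'YY construct described above a bit more concrete, here
is a rough Python sketch of how a writer might produce it. The function
names rtf_escape and rtf_document are made up for this example, Windows
code page 1252 is only an assumed default, and the header is the bare
minimum; this is not how Word itself does it, just one way to get the
same kind of output.

```python
import struct

def rtf_escape(text, codepage="cp1252"):
    """Hypothetical helper: turn a Python string into 7-bit RTF body text."""
    pieces = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces are RTF syntax and must be escaped.
            pieces.append("\\" + ch)
        elif ord(ch) < 0x80:
            # Plain ASCII passes through (line breaks are not handled here).
            pieces.append(ch)
        else:
            # Fallback byte for non-Unicode-aware readers; "?" when the
            # character has no single-byte form in the assumed code page.
            try:
                fallback = ch.encode(codepage)
            except UnicodeEncodeError:
                fallback = b"?"
            if len(fallback) != 1:
                fallback = b"?"
            # \u takes a signed 16-bit decimal number, so encode to UTF-16
            # and read each code unit back as a signed short.
            raw = ch.encode("utf-16-le")
            units = struct.unpack("<%dh" % (len(raw) // 2), raw)
            for unit in units:
                pieces.append("\\u%d\\'%02x" % (unit, fallback[0]))
    return "".join(pieces)

def rtf_document(text, ansicpg=1252):
    """Hypothetical wrapper: minimal header naming the ANSI code page."""
    body = rtf_escape(text, "cp%d" % ansicpg)
    return "{\\rtf1\\ansi\\ansicpg%d\\uc1 %s}" % (ansicpg, body)

print(rtf_document("1 \u20ac = caf\u00e9"))
```

The \uc1 in the header tells the reader that one fallback character
follows each \u command, which is what the sketch emits. Characters
outside the 16-bit range come out as two \u commands (a surrogate pair),
each with its own fallback.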
As for HTML, you can either embed numeric character references of the form
&#NNNN; (NNNN being the decimal Unicode value) in an otherwise 8-bit source
text, or have the whole source text in UTF-8.
(This is probably rather over-simplified, I guess... :-)
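
As a small illustration of the two HTML options, here is a Python sketch.
The function name to_numeric_refs is made up for this example, and it
takes the most conservative route of keeping only 7-bit ASCII literal and
turning everything else into &#NNNN; references; it does not escape
markup characters such as & or <.

```python
import html

def to_numeric_refs(text):
    """Hypothetical helper: replace every non-ASCII character by &#NNNN;."""
    # The standard "xmlcharrefreplace" error handler emits decimal references.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

sample = "caf\u00e9 costs 2 \u20ac"

# Option 1: references embedded in otherwise plain source text.
as_refs = to_numeric_refs(sample)
print(as_refs)

# Option 2: the whole source text as UTF-8 bytes.
as_utf8 = sample.encode("utf-8")
print(as_utf8)

# Both forms decode back to the same characters.
print(html.unescape(as_refs) == as_utf8.decode("utf-8"))
```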
Hope this helps,
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:18 EDT