Re: Benefits of Unicode

From: Lukas Pietsch (pietsch@mail.uni-freiburg.de)
Date: Sun Jan 28 2001 - 16:41:21 EST


>
> Francois M Richard wrote:
> >
> > Can Unicode conformance be applied to rtf (and how)?
> >
Newer Microsoft products (from Office 97 onwards?) seem to use constructs
of the form
\uNNNN\'YY to encode Unicode characters, where NNNN is the *decimal*
Unicode value and YY is the hex code of a replacement character in an ANSI
codepage, given as an alternative for non-Unicode-aware readers. The RTF
source text itself is encoded in 7-bit ASCII, and the codepage used to
interpret the \'YY commands is specified by a command in the header
(\ansicpgN, as far as I can tell).
This is apparently the method many Windows applications use to exchange
Unicode data, e.g. through the clipboard. Just save a sample Word document
with some Unicode characters as RTF to see how it works.
There are more details on this somewhere
in the MSDN library, under "Specifications/Applications/Rich Text Format".
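
To make the \uNNNN\'YY scheme concrete, here is a rough Python sketch of how a writer might produce it. The function name rtf_escape and the default codepage cp1252 are my own assumptions for illustration, not anything taken from Word or the RTF spec. As I understand the spec, the number after \u is a signed 16-bit value, so UTF-16 code units above 32767 are written as negative numbers; the sketch follows that reading and uses "?" as the replacement wherever the codepage has no single-byte equivalent.

```python
def rtf_escape(text, codepage="cp1252"):
    # Hypothetical helper: turn text into 7-bit RTF body text with \uN\'YY escapes.
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)      # RTF's own special characters
        elif ord(ch) < 128:
            out.append(ch)             # plain ASCII passes through (line breaks not handled)
        else:
            # Replacement byte for readers that ignore \u; "?" if the codepage lacks it.
            try:
                fallback = ch.encode(codepage)
            except UnicodeEncodeError:
                fallback = b"?"
            if len(fallback) != 1:
                fallback = b"?"
            # Emit one \uN per UTF-16 code unit (two units for characters beyond U+FFFF).
            raw = ch.encode("utf-16-le")
            for i in range(0, len(raw), 2):
                unit = raw[i] | (raw[i + 1] << 8)
                signed = unit - 65536 if unit > 32767 else unit
                out.append("\\u%d\\'%02x" % (signed, fallback[0]))
    return "".join(out)
```

Calling rtf_escape("Grüße, 10 €") should give Gr\u252\'fc\u223\'dfe, 10 \u8364\'80, which is only the body text; a real file would still need the {\rtf1 ... } wrapper and the codepage declaration in the header.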

As for HTML, you can either embed numeric character references of the form
&#NNNN; (NNNN again being the decimal Unicode value) in an otherwise 8-bit
source text, or have the whole source text in UTF-8. (This is probably
rather over-simplified, I guess... :-)
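
The two HTML options can be sketched in Python as well. The function names html_with_references and html_as_utf8 are made up for this example; the "xmlcharrefreplace" error handler is a standard part of Python's codecs and produces exactly the decimal &#NNNN; form. Both variants escape &, < and > first so that the text cannot be mistaken for markup.

```python
import html

def html_with_references(text):
    # Option 1: pure ASCII output, every non-ASCII character becomes &#NNNN; (decimal).
    escaped = html.escape(text, quote=False)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")

def html_as_utf8(text):
    # Option 2: keep the characters as they are and encode the whole text as UTF-8 bytes.
    return html.escape(text, quote=False).encode("utf-8")
```

With the second option the page also has to declare its encoding (in the HTTP header or a meta tag), otherwise the browser has to guess.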

Hope this helps,

Lukas



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:18 EDT