Help! I am confused about when various pieces of software under
various operating environments may quietly transform characters.
To take but one example:-
the UTF-8 string for the Thai character cho ching is happily rendered
by my browser (Netscape Navigator 3.0) as:
a-grave cedilla per_mille_sign
(and it comes out on the printer like that as well)
which by my reckoning is the correct UTF-8 string for U+0E09.
When I read the email from Mark Davis 8/1/99 14:19, that same string
is rendered by my email (Lotus cc:Mail 6.3) as:
a-grave comma percent_sign
which is a malformed UTF-8 string anyway.
Both the Netscape and the email are running under Windows 3.11.
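
For anyone wanting to check the two renderings above, here is a minimal Python sketch. It rests on two assumptions that are not confirmed anywhere in this message: that the browser displayed the raw UTF-8 bytes as Windows code page 1252, and that the mail client then replaced cedilla and per mille sign with the look-alike comma and percent sign. The helper name is_well_formed_utf8 is made up for illustration.

```python
import unicodedata

# Thai character cho ching, U+0E09, encoded as UTF-8.
cho_ching = "\u0e09"
utf8_bytes = cho_ching.encode("utf-8")
utf8_hex = " ".join(f"{b:02X}" for b in utf8_bytes)
print(utf8_hex)  # E0 B8 89

# Assumption: the browser shows each byte as a Windows-1252 character.
as_cp1252 = utf8_bytes.decode("cp1252")
browser_names = [unicodedata.name(ch) for ch in as_cp1252]
print(browser_names)
# ['LATIN SMALL LETTER A WITH GRAVE', 'CEDILLA', 'PER MILLE SIGN']

# Assumption: the mail client substituted look-alike characters,
# giving a-grave, comma, percent sign (bytes E0 2C 25 in cp1252).
mail_bytes = "\u00e0,%".encode("cp1252")


def is_well_formed_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(is_well_formed_utf8(utf8_bytes))  # True
print(is_well_formed_utf8(mail_bytes))  # False: E0 needs two continuation bytes
```

The second check fails because the lead byte E0 must be followed by two bytes in the range 80 to BF, and neither 2C nor 25 is in that range, so the original character cannot be recovered from what the mail client shows.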
I am endeavouring to assess how much trouble this sort of quiet
character transformation (or bit pattern transformation if you like)
is going to give us in the BL as we move towards keeping our huge
quantity of bibliographic data in Unicode.
The British Library