Characters changing (was Collation TR)

From: Chris White (Chris.White@mail.bl.uk)
Date: Mon Jan 11 1999 - 06:35:24 EST


     
     
     Help! I am confused about when various pieces of software under
     various operating environments may quietly transform characters.
     
     To take but one example:-
     On
       http://www.macchiato.com/mark/unicode/UTF8List.html
     the UTF-8 string for the Thai character cho ching is happily rendered
     by my browser (Netscape Navigator 3.0) as:
         a-grave cedilla per_mille_sign
     (and it comes out on the printer like that as well)
     which by my reckoning is the correct UTF-8 string for U+0E09.
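
     That reckoning can be checked with a short Python sketch. It
     assumes (the message does not say so) that the browser is showing
     the raw UTF-8 bytes as if they were Windows-1252 text; the names
     CHO_CHING, utf8_bytes and misread are illustrative only.

```python
import unicodedata

# U+0E09 is THAI CHARACTER CHO CHING.
CHO_CHING = "\u0e09"

# Encode the character as UTF-8: three bytes, E0 B8 89.
utf8_bytes = CHO_CHING.encode("utf-8")
print(" ".join(f"{b:02X}" for b in utf8_bytes))

# Assumption: the display layer treats each byte as one Windows-1252
# character instead of decoding the three bytes together as UTF-8.
misread = utf8_bytes.decode("cp1252")
for ch in misread:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

     Under that assumption the three bytes come out as a-grave, cedilla
     and per mille sign, which matches what the browser shows.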
     
     When I read the email from Mark Davis 8/1/99 14:19, that same string
     is rendered by my email (Lotus cc:Mail 6.3) as:
         a-grave comma percent_sign
     which is a malformed UTF-8 string anyway.
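
     The difference between the two renderings can be sketched in
     Python as below. The sketch assumes both programs display in
     Windows-1252, so that the characters seen map back to single
     bytes; is_well_formed_utf8, browser_bytes and mail_bytes are
     hypothetical names used only for illustration.

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Assumption: both renderings are Windows-1252 views of the stored bytes.
# Browser: a-grave, cedilla, per mille sign.
browser_bytes = "\u00e0\u00b8\u2030".encode("cp1252")
# Email: a-grave, comma, percent sign.
mail_bytes = "\u00e0,%".encode("cp1252")

for label, data in (("browser", browser_bytes), ("email", mail_bytes)):
    hex_dump = " ".join(f"{b:02X}" for b in data)
    print(label, hex_dump, is_well_formed_utf8(data))
```

     On this reading the lead byte E0 survives in the email, but the
     two continuation bytes (B8, 89) have become ASCII bytes (2C, 25),
     so the sequence can no longer be decoded as UTF-8. That would
     suggest a substitution of look-alike characters rather than a
     change to the bit pattern as such, though this is only an
     inference from the rendered text.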
     
     Both the Netscape and the email are running under Windows 3.11.
     
     I am endeavouring to assess how much trouble this sort of quiet
     character transformation (or bit pattern transformation if you like)
     is going to give us in the BL as we move towards keeping our huge
     quantity of bibliographic data in Unicode.
     
     
     Chris White
     Systems Analyst
     The British Library
     
     



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:43 EDT