Gb2312 encoding

From: Viswanathan S (vichu@sasi.com)
Date: Wed Sep 20 2000 - 01:27:50 EDT


Hi,

    In an HTML page encoded using the "gb2312" character encoding,
how can ASCII characters be distinguished from GB2312 characters?

    In Big5 encoding, the first byte of a two-byte character always
has its most significant bit set to 1. This can be used to distinguish
ASCII characters from Big5 characters.
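
    The Big5 check described above can be sketched in a few lines of
Python. This is only an illustration, assuming the input is raw Big5
bytes; the function name split_big5 is made up for this example. Only
the first byte of each pair is tested, because the second byte of a
Big5 character may fall in the ASCII range.

```python
def split_big5(data: bytes) -> list[bytes]:
    """Split raw Big5 bytes into one-byte ASCII units and two-byte characters.

    Hypothetical helper for illustration: it looks only at the high bit
    of the lead byte and does not validate the trail byte.
    """
    units = []
    i = 0
    while i < len(data):
        # High bit set on the lead byte: treat it and the next byte as one character.
        if data[i] & 0x80 and i + 1 < len(data):
            units.append(data[i:i + 2])
            i += 2
        else:
            # High bit clear (or a truncated pair at the end): a single byte.
            units.append(data[i:i + 1])
            i += 1
    return units


if __name__ == "__main__":
    # 0xA4 0x40 is a two-byte unit even though 0x40 alone would be ASCII '@'.
    print(split_big5(b"A\xa4\x40B"))
```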

    In GB2312, however, I see that the first byte can have a value in
the ASCII range (e.g. the code 0x2121, whose first byte is 0x21). When
such a pattern occurs, how do I tell whether it is two ASCII bytes,
each with the value 0x21, or one GB2312 character?
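
    One way to examine the ambiguity is sketched below. It rests on an
assumption the post does not confirm: that the page bytes are in the
EUC-CN form, which is what the "gb2312" label normally denotes in HTML.
In that form a code such as 0x2121 is stored with the high bit set on
both bytes (0xA1 0xA1), so it would not collide with two ASCII 0x21
bytes. The names gb2312_to_euc_cn and classify are made up for this
example.

```python
def gb2312_to_euc_cn(code: int) -> bytes:
    """Convert a 7-bit GB2312 code (e.g. 0x2121) to its EUC-CN byte pair."""
    hi, lo = code >> 8, code & 0xFF
    if not (0x21 <= hi <= 0x7E and 0x21 <= lo <= 0x7E):
        raise ValueError(f"not a 7-bit GB2312 code: {code:#06x}")
    # EUC-CN sets the high bit on both bytes.
    return bytes([hi | 0x80, lo | 0x80])


def classify(data: bytes) -> list[tuple[str, bytes]]:
    """Label each unit of an assumed EUC-CN byte string (hypothetical helper)."""
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # High bit clear: a plain ASCII byte.
            units.append(("ascii", data[i:i + 1]))
            i += 1
        elif (0xA1 <= b <= 0xFE and i + 1 < len(data)
              and 0xA1 <= data[i + 1] <= 0xFE):
            # Both bytes in 0xA1..0xFE: one two-byte GB2312 character.
            units.append(("gb2312", data[i:i + 2]))
            i += 2
        else:
            # Anything else does not fit the assumed encoding.
            units.append(("invalid", data[i:i + 1]))
            i += 1
    return units


if __name__ == "__main__":
    print(gb2312_to_euc_cn(0x2121))   # the EUC-CN bytes for code 0x2121
    print(classify(b"!!"))            # two ASCII 0x21 bytes
    print(classify(b"\xa1\xa1"))      # one two-byte character
```

If the bytes in the page really are 0x21 0x21 with the high bit clear,
then under this assumption they are simply two ASCII characters.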

Regards,

Viswanathan S


