In an HTML page encoded using the "gb2312" character encoding,
how can I distinguish ASCII characters from GB2312 characters?
In Big5 encoding, the first byte of a two-byte character always
has its most significant bit set to 1. This can be used to distinguish
ASCII from Big5 characters.
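
To make the Big5 rule concrete, here is a minimal sketch in Python. The function name and the sample bytes are illustrative assumptions, not taken from any library. It scans from the start of the data and treats any byte with the high bit set as the lead byte of a two-byte character; the trail byte is consumed without inspection, since in Big5 it may itself fall in the ASCII range.

```python
def split_ascii_and_double_byte(data: bytes):
    """Split bytes into ("ascii", b) / ("double", bb) tokens by the lead byte's high bit."""
    tokens = []
    i = 0
    while i < len(data):
        if data[i] & 0x80:
            # High bit set: lead byte of a two-byte character.
            # The following byte is taken as the trail byte, whatever its value.
            tokens.append(("double", data[i:i + 2]))
            i += 2
        else:
            # High bit clear: a single ASCII byte.
            tokens.append(("ascii", data[i:i + 1]))
            i += 1
    return tokens


# Hypothetical sample: 'A', one Big5 character (0xA4 0x40), then '!'.
sample = b"A" + bytes([0xA4, 0x40]) + b"!"
print(split_ascii_and_double_byte(sample))
```

Note that the trail byte 0x40 in the sample is also the ASCII code for '@', so the scan has to start from a known character boundary rather than from an arbitrary offset.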
In GB2312, however, I see that the first byte can have a value in the
ASCII range (e.g. 0x2121). When such a pattern occurs,
how do I tell whether it is two ASCII bytes with the value
0x21 or one GB2312 character?
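
One way to probe this is to compare the two notations directly. The sketch below assumes that the page bytes are in the EUC-CN form, which is what the "gb2312" charset label normally denotes and what Python's built-in `gb2312` codec implements; in that form 0x80 is added to each byte of the 0x2121-style table code. The helper name `to_euc_cn` is hypothetical.

```python
def to_euc_cn(table_code: int) -> bytes:
    """Convert a 7-bit GB 2312 table code (e.g. 0x2121) to its EUC-CN byte pair."""
    high = (table_code >> 8) | 0x80   # set the high bit of the first byte
    low = (table_code & 0xFF) | 0x80  # set the high bit of the second byte
    return bytes([high, low])


euc = to_euc_cn(0x2121)
print(euc.hex())                       # byte pair as it would appear in the page
print(len(euc.decode("gb2312")))       # number of characters the pair decodes to
print(b"\x21\x21".decode("gb2312"))    # two plain 0x21 bytes, decoded with the same codec
```

Under that assumption, the bytes 0x21 0x21 in the page are always two ASCII characters, and the character listed as 0x2121 in the tables appears in the page with the high bit set on both bytes, so the same high-bit test used for Big5 would apply.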