UTF-8 ill-formed question

From: James Lin <James_Lin_at_symantec.com>
Date: Tue, 11 Dec 2012 11:16:28 -0800

Does anyone know why ill-formed sequences occur in UTF-8? Besides the fact that such a sequence doesn't follow the pattern of UTF-8 byte sequences, I'm just wondering how or why it happens.
Suppose I have the code point U+4E8C, or "二".
In UTF-8 it's "E4 BA 8C", while in UTF-16 it's "4E8C". Where does the "BA" come from?
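
For reference, here is a minimal sketch of how the three bytes could be derived, assuming the standard three-byte UTF-8 pattern 1110xxxx 10xxxxxx 10xxxxxx that applies to code points U+0800..U+FFFF. The function name utf8_three_byte is made up for illustration and is not part of any library. The 16 bits of U+4E8C (0100 111010 001100) are split 4 + 6 + 6 across the three bytes, and the middle group, 111010, prefixed with 10, gives 10111010 = BA.

```python
def utf8_three_byte(cp: int) -> bytes:
    """Encode a code point in U+0800..U+FFFF (surrogates excluded) as UTF-8.

    Hypothetical helper for illustration; it only covers the three-byte form.
    """
    if not 0x0800 <= cp <= 0xFFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not encodable as a three-byte UTF-8 sequence")
    b1 = 0xE0 | (cp >> 12)          # 1110xxxx: top 4 bits of the code point
    b2 = 0x80 | ((cp >> 6) & 0x3F)  # 10xxxxxx: middle 6 bits
    b3 = 0x80 | (cp & 0x3F)         # 10xxxxxx: low 6 bits
    return bytes([b1, b2, b3])


encoded = utf8_three_byte(0x4E8C)
print(encoded.hex(" ").upper())  # E4 BA 8C
```

The result should agree with Python's built-in encoder, i.e. "二".encode("utf-8"). UTF-16 needs no such bit shuffling for this code point, which is why it stays "4E8C" there.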


Received on Tue Dec 11 2012 - 13:23:51 CST
