UTF-8 ill-formed question

From: James Lin <James_Lin_at_symantec.com>
Date: Tue, 11 Dec 2012 11:16:28 -0800

Hi,
Does anyone know why ill-formed sequences occur in UTF-8? Besides the fact that they don't follow the pattern of UTF-8 byte sequences, I'm just wondering how or why.
Say I have a code point: U+4E8C, or "二".
In UTF-8 it's "E4 BA 8C", while in UTF-16 it's "4E8C". Where does this "BA" come from?

thanks
-James

Received on Tue Dec 11 2012 - 13:23:51 CST
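
The "BA" in the question can be traced by writing out the bits of U+4E8C. The following is a minimal sketch, assuming the standard three-byte UTF-8 pattern 1110xxxx 10xxxxxx 10xxxxxx for code points U+0800..U+FFFF. The function name utf8_encode_3byte is made up for illustration and is not part of any library.

```python
def utf8_encode_3byte(cp: int) -> bytes:
    """Encode one code point in U+0800..U+FFFF (surrogates excluded) as UTF-8.

    Hypothetical helper for illustration; real code should use str.encode().
    """
    if not 0x0800 <= cp <= 0xFFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("code point is outside the three-byte UTF-8 range")
    # U+4E8C = 0100 111010 001100 in binary, split as 4 + 6 + 6 bits.
    b1 = 0xE0 | (cp >> 12)           # 1110 0100 -> E4 (top 4 bits)
    b2 = 0x80 | ((cp >> 6) & 0x3F)   # 10 111010 -> BA (middle 6 bits)
    b3 = 0x80 | (cp & 0x3F)          # 10 001100 -> 8C (low 6 bits)
    return bytes([b1, b2, b3])


encoded = utf8_encode_3byte(0x4E8C)
print(" ".join(f"{b:02X}" for b in encoded))  # E4 BA 8C

# Cross-check against Python's built-in encoder.
assert encoded == "二".encode("utf-8")
```

So "BA" is the middle six bits of 4E8C (111010) behind the continuation-byte prefix 10. A byte sequence that breaks this lead-byte and continuation-byte pattern, such as E4 BA with no third byte, is what gets called ill-formed.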
