From: Mark Davis (firstname.lastname@example.org)
Date: Mon Dec 17 2007 - 15:32:04 CST
This should really be directed to the ICU list.
ICU gives you a choice of whether to get an error and stop, or whether to
substitute a character (and what that is). So if you want to check for
actual FFFD characters in the input stream (as opposed to those that are
replacements for erroneous or missing sequences), you have the tools to do so.
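
The distinction above can be sketched roughly as follows. This is not ICU code: it uses Python's built-in UTF-16 codec as a stand-in, where errors="strict" plays the role of an "error and stop" setting and errors="replace" plays the role of substitution. The function name check_utf16le and the sample byte strings are made up for illustration.

```python
def check_utf16le(data: bytes):
    """Return (text, had_error) for UTF-16LE input.

    Decodes strictly first, so a U+FFFD that is really present in the
    input is never confused with one substituted for ill-formed data.
    """
    try:
        # Stop-on-error mode: raises on ill-formed sequences such as
        # lone surrogates or a truncated code unit.
        return data.decode("utf-16-le", errors="strict"), False
    except UnicodeDecodeError:
        # Substitution mode: ill-formed sequences become U+FFFD,
        # and the caller is told that a substitution happened.
        return data.decode("utf-16-le", errors="replace"), True


# Well-formed input that genuinely contains U+FFFD.
genuine = "a\ufffdb".encode("utf-16-le")

# Ill-formed input: 'a', a lone high surrogate (D800), then 'b'.
broken = b"a\x00" + b"\x00\xd8" + b"b\x00"

text_ok, err_ok = check_utf16le(genuine)
text_bad, err_bad = check_utf16le(broken)
print(err_ok, err_bad)
```

The point of the sketch is that the error flag comes from the converter itself, not from scanning the output for U+FFFD, so a real U+FFFD in the input is passed through without being reported as an error.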
On Dec 17, 2007 2:31 AM, erra srikrishna <email@example.com> wrote:
> Hi all,
> I need a clarification regarding the replacement character U+FFFD.
> According to Unicode conformance clauses C4, C5 & C6,
> if any noncharacters, unassigned code points, or low or high surrogate code
> points exist in Unicode input, they should be skipped or replaced with the
> U+FFFD character.
> According to Unicode conformance clause C12a,
> any Unicode (UTF-8, UTF-16 & UTF-32) application should not accept
> ill-formed code unit sequences from its input. It should either signal an
> error or represent the code unit with a marker such as U+FFFD (REPLACEMENT
> CHARACTER).
> I am using IBM ICU, and ICU uses U+FFFD as its default replacement character,
> so I want to know how we should treat U+FFFD when the input itself contains
> that character.
> I want my application to return an error whenever the above sequences
> are found, but ICU by default replaces them with U+FFFD. So I am checking the
> input for U+FFFD, concluding that some invalid sequence occurred and that
> ICU replaced it with U+FFFD, and then generating an error.
> But this does not work for input that actually contains U+FFFD. What should I
> do in that case: generate an error, or something else? I didn't see any
> conformance clause specifying what should be done for U+FFFD.
> Here I am mainly concerned with UTF-16 input.
> Srikrishna Erra
> *Krishna E*
This archive was generated by hypermail 2.1.5 : Mon Dec 17 2007 - 15:34:48 CST