reg: Unicode Conformance - - - - U+FFFD

From: erra srikrishna (erra_krishna@yahoo.co.in)
Date: Mon Dec 17 2007 - 04:31:30 CST

Next message: Mark Davis: "Re: reg: Unicode Conformance - - - - U+FFFD"

Previous message: Ngwe Tun: "Re: Burmese Typewriter Keyboard"
Next in thread: Mark Davis: "Re: reg: Unicode Conformance - - - - U+FFFD"
Reply: Mark Davis: "Re: reg: Unicode Conformance - - - - U+FFFD"

Hi all,

  I need a clarification regarding the replacement character U+FFFD.

  According to Unicode conformance clauses C4, C5 and C6, if any noncharacters, unassigned code points, or high or low surrogate code points are present in Unicode input, they should be skipped or replaced with the U+FFFD character.
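  To make the three categories above concrete, here is a minimal Python sketch (not ICU). The function names are hypothetical, and the "unassigned" test is an assumption: it relies on whatever Unicode version ships with the Python interpreter, so its answers can differ between versions.

```python
import unicodedata


def is_surrogate(cp):
    # High surrogates are U+D800..U+DBFF, low surrogates are U+DC00..U+DFFF.
    return 0xD800 <= cp <= 0xDFFF


def is_noncharacter(cp):
    # U+FDD0..U+FDEF, plus the last two code points of every plane
    # (U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, ... U+10FFFE, U+10FFFF).
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def classify(cp):
    """Label a code point in the range 0..0x10FFFF (hypothetical helper)."""
    if is_surrogate(cp):
        return "surrogate"
    if is_noncharacter(cp):
        return "noncharacter"
    # Assumption: general category "Cn" in this interpreter's Unicode data
    # means "unassigned" once noncharacters have been excluded above.
    if unicodedata.category(chr(cp)) == "Cn":
        return "unassigned"
    return "ok"
```

  Note that U+FFFD itself is an assigned character, so classify(0xFFFD) falls through to "ok".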

  According to Unicode conformance clause C12a, any Unicode (UTF-8, UTF-16 and UTF-32) application should not accept ill-formed code unit sequences in its input. It should either signal an error or represent the code unit with a marker such as U+FFFD (REPLACEMENT CHARACTER).
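  The two options (signal an error, or substitute a marker) can be sketched for UTF-16 as follows. This is plain Python operating on a list of 16-bit code units, not ICU code; the function name and the on_error parameter are hypothetical.

```python
REPLACEMENT = 0xFFFD


def scrub_utf16(units, on_error="replace"):
    """Return a well-formed copy of `units`, a sequence of 16-bit code units.

    An unpaired surrogate is replaced with U+FFFD, or raises ValueError
    when on_error is "strict".
    """
    out = []
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        # A high surrogate immediately followed by a low surrogate is a valid pair.
        if 0xD800 <= u <= 0xDBFF and i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
            out.extend((u, units[i + 1]))
            i += 2
            continue
        if 0xD800 <= u <= 0xDFFF:
            # Any other surrogate is unpaired, which makes the sequence ill-formed.
            if on_error == "strict":
                raise ValueError(f"unpaired surrogate 0x{u:04X} at index {i}")
            out.append(REPLACEMENT)
        else:
            out.append(u)
        i += 1
    return out
```

  In replace mode the output for [0x41, 0xD800, 0x42] and for [0x41, 0xFFFD, 0x42] is the same, which is exactly where the question below comes from.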

  I am using IBM ICU, and ICU uses U+FFFD as its default replacement character. So I want to know: if the input itself contains a U+FFFD character, how should that character be treated?

  I want my application to return an error whenever the above sequences are found, but ICU by default replaces them with U+FFFD. So I am checking the input for U+FFFD; if it is present, I conclude that some invalid sequence occurred and ICU replaced it with U+FFFD, and I generate an error.

  But this does not work for input that genuinely contains U+FFFD. What should be done in that case: generate an error, or something else? I did not see any conformance clause specifying what should be done for U+FFFD.
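  The ambiguity can be reproduced outside ICU. The sketch below uses Python's built-in utf-16-le codec rather than ICU, and the function name is hypothetical. It only illustrates one possible way around the problem, under the assumption that the decoder can be asked to report ill-formed input instead of substituting for it.

```python
def decode_utf16le_strict(data):
    """Decode UTF-16LE bytes, reporting ill-formed input instead of replacing it."""
    try:
        return data.decode("utf-16-le", errors="strict")
    except UnicodeDecodeError as exc:
        # The decoder itself signals the error, so no U+FFFD scan is needed.
        raise ValueError(
            f"ill-formed UTF-16 at byte offset {exc.start}: {exc.reason}"
        ) from exc


# Well-formed input that genuinely contains U+FFFD.
literal = "\ufffd".encode("utf-16-le")

# Ill-formed input: a high surrogate (0xD800) followed by 'A' instead of a low surrogate.
broken = b"\x00\xd8A\x00"

# With replacement enabled, the ill-formed input also ends up containing U+FFFD,
# so scanning the result for U+FFFD cannot tell the two cases apart.
lenient = broken.decode("utf-16-le", errors="replace")
```

  Here decode_utf16le_strict(literal) returns the U+FFFD character without complaint, while decode_utf16le_strict(broken) raises ValueError. Scanning for U+FFFD after a replacing conversion would flag both inputs.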

  Here I am mainly concerned with UTF-16 input.

  Thanks

  Regards
  Srikrishna Erra

Krishna E


This archive was generated by hypermail 2.1.5 : Mon Dec 17 2007 - 10:59:37 CST