David Starner, normally <firstname.lastname@example.org> but on this occasion
> I was having some problems with a test of my SCSU decoder recently,
> and I discovered it was due to my decoder rejecting 10FFFF as a valid
> Unicode value (because it ends in FFFF.) The fourth test pattern,
> Section 9.4 of Tech Report 6 (SCSU) uses DBFF DFFF as a surrogate
> pair, which is 10FFFF. Is this wrong, or is there something I'm
Good question. Unicode scalar values ending in FFFE and FFFF do not
represent valid characters, but by definition D29 (recently clarified
for me) a UTF must encode and decode these values. SCSU is not a UTF,
but my guess is that this requirement should apply to SCSU as well.
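
To make that concrete, here is a rough Python sketch (the function names are my own, purely for illustration) showing that the surrogate pair DBFF DFFF really does combine to 10FFFF, and that a UTF round-trips the value even though it ends in FFFF. The check at the end only covers the "ends in FFFE or FFFF" rule discussed here, not any other class of invalid value.

```python
def combine_surrogates(high, low):
    """Combine a UTF-16 surrogate pair into one Unicode scalar value."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def ends_in_fffe_or_ffff(scalar):
    """True for values whose low 16 bits are FFFE or FFFF (illustrative helper)."""
    return (scalar & 0xFFFE) == 0xFFFE


scalar = combine_surrogates(0xDBFF, 0xDFFF)

# The UTF itself does not refuse the value; it encodes and decodes it.
utf8 = chr(scalar).encode("utf-8")
round_tripped = ord(utf8.decode("utf-8"))

# Python's UTF-16 codec reaches the same scalar value from the same pair.
from_utf16 = ord(bytes.fromhex("DBFFDFFF").decode("utf-16-be"))

# Whether the value is a usable character is a separate question.
is_nonchar = ends_in_fffe_or_ffff(scalar)
```

The point of keeping the last helper separate is that the transformation step and the validity decision are two different jobs.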
I think the SCSU decoder should go ahead and decode the 0B BF FF and
subsequent 15 FF as U+10FFFF, and leave the job of deciding which values
are valid or invalid to the higher-level process that interprets them.
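
For what it's worth, the arithmetic for those five bytes can be sketched as follows. This is only a fragment handling the three kinds of byte that occur in 0B BF FF 15 FF, based on my reading of the tag definitions in Tech Report 6 (SDX = 0B defines and selects an extended window, 10..17 select a window, bytes 80..FF index into the active window). It is not a complete SCSU decoder, and the names are hypothetical.

```python
SDX = 0x0B   # "define extended window" tag, followed by hbyte, lbyte
SC0 = 0x10   # SC0..SC7 = 0x10..0x17, "change to window n"


def decode_fragment(data):
    """Decode a tiny SCSU fragment; returns a list of scalar values.

    Assumption: only SDX, SCn, and bytes >= 0x80 are handled. Everything
    else raises, because this is a sketch and not a full decoder.
    """
    windows = [None] * 8   # real decoders start with default offsets
    active = 0
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b == SDX:
            hbyte, lbyte = data[i + 1], data[i + 2]
            active = hbyte >> 5                      # top 3 bits: window number
            value = ((hbyte & 0x1F) << 8) | lbyte    # low 13 bits
            windows[active] = 0x10000 + 0x80 * value
            i += 3
        elif SC0 <= b <= SC0 + 7:
            active = b - SC0
            i += 1
        elif b >= 0x80:
            if windows[active] is None:
                raise ValueError("window %d not defined in this sketch" % active)
            # No validity check here: values ending in FFFE/FFFF pass through.
            out.append(windows[active] + (b - 0x80))
            i += 1
        else:
            raise ValueError("byte %02X not handled in this sketch" % b)
    return out


result = decode_fragment(bytes.fromhex("0BBFFF15FF"))
```

Here 0B BF FF sets window 5 to offset 10FF80, 15 selects window 5 again, and FF adds 7F to the offset, giving 10FFFF. The decoder hands that value on without judging it.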
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:06 EDT