Minor error in TR #16

From: Doug Ewell (dewell@compuserve.com)
Date: Wed Jun 14 2000 - 10:18:40 EDT


I found a small error in Technical Report #16, "UTF-EBCDIC."

In Section 3.5, "Signature," there is the following passage:

   The signature character U+FEFF (zero width no-break space) of Unicode
   transforms into the I8-byte sequence X'F1 BF B7 BF' which maps to
   X'DD 73 66 73' in UTF-EBCDIC. When this sequence is displayed
   (erroneously) using different a [sic] single-byte EBCDIC code pages,
   it can be visualized as different character strings. In Latin-1
   EBCDIC code page 1047 (and coincidentally also in Latin-1 code pages
   500 and 37), this byte sequence appears as "ùËÃÊ" (small letter u
   with grave, capital letter E with diaeresis, capital letter A with
   tilde, capital letter E with circumflex).

If the 4-byte I8 sequence contains two 0xBF bytes, and they both
map to 0x73 (as of course they must), then they cannot be displayed
as the two different characters 'Ë' and 'Ê'. The text should read:

   ... this byte sequence appears as "ùËÃË" (small letter u with grave,
   capital letter E with diaeresis, capital letter A with tilde, capital
   letter E with diaeresis).
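
The correction can be checked mechanically. The following is only a
rough Python sketch, not anything taken from the TR: the function name
i8_four_byte and the table I8_TO_EBCDIC are made up for illustration,
the table holds only the three byte pairs quoted in the passage above
(not the full I8-to-EBCDIC mapping), and the 4-byte I8 layout (lead
byte 11110xxx, three trail bytes 101xxxxx, covering U+4000..U+3FFFF)
is assumed. It checks code pages 37 and 500 only, using Python's
"cp037" and "cp500" codecs; code page 1047 is not checked.

```python
# Partial I8 -> UTF-EBCDIC byte map: ONLY the three pairs quoted in the
# passage (F1->DD, BF->73, B7->66). Illustrative, not the full table.
I8_TO_EBCDIC = {0xF1: 0xDD, 0xBF: 0x73, 0xB7: 0x66}


def i8_four_byte(cp):
    """Encode a code point in the 4-byte I8 form.

    Assumed layout: 11110xxx 101xxxxx 101xxxxx 101xxxxx,
    assumed range U+4000..U+3FFFF.
    """
    if not 0x4000 <= cp <= 0x3FFFF:
        raise ValueError("outside the 4-byte I8 range assumed here")
    return bytes([
        0xF0 | (cp >> 15),            # lead byte: top 3 bits
        0xA0 | ((cp >> 10) & 0x1F),   # trail byte 1: next 5 bits
        0xA0 | ((cp >> 5) & 0x1F),    # trail byte 2: next 5 bits
        0xA0 | (cp & 0x1F),           # trail byte 3: low 5 bits
    ])


def hexdump(data):
    """Format bytes as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in data)


i8 = i8_four_byte(0xFEFF)
utf_ebcdic = bytes(I8_TO_EBCDIC[b] for b in i8)

# Deliberately misread the UTF-EBCDIC bytes as single-byte EBCDIC text.
shown_037 = utf_ebcdic.decode("cp037")
shown_500 = utf_ebcdic.decode("cp500")

print("I8:        ", hexdump(i8))
print("UTF-EBCDIC:", hexdump(utf_ebcdic))
print("cp037:     ", shown_037)
print("cp500:     ", shown_500)
```

Whatever the code page, the second and fourth bytes are both 0x73, so
the second and fourth displayed characters are necessarily the same.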

The stray "a" in the passage which I marked with "[sic]" was left in for
accuracy, but it is not the error I was referring to. The TR contains
several such typos, so it would be unfair to single this one out.

-Doug Ewell
 Fullerton, California


