I found a small error in Technical Report #16, "UTF-EBCDIC."
In Section 3.5, "Signature," there is the following passage:
The signature character U+FEFF (zero width no-break space) of Unicode
transforms into the I8-byte sequence X'F1 BF B7 BF' which maps to
X'DD 73 66 73' in UTF-EBCDIC. When this sequence is displayed
(erroneously) using different a [sic] single-byte EBCDIC code pages,
it can be visualized as different character strings. In Latin-1
EBCDIC code page 1047 (and coincidentally also in Latin-1 code pages
500 and 37), this byte sequence appears as "ùËÃÊ" (small letter u
with grave, capital letter E with diaeresis, capital letter A with
tilde, capital letter E with circumflex).
If the 4-byte I8-sequence contains two 0xBF bytes, and they both
map to 0x73 (as of course they must), then they will not be displayed
as the two different characters 'Ë' and 'Ê'. The text should read:
... this byte sequence appears as "ùËÃË" (small letter u with grave,
capital letter E with diaeresis, capital letter A with tilde, capital
letter E with diaeresis).
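
For anyone who wants to check the arithmetic, here is a rough Python sketch.
The helper name i8_four_byte is my own, the I8-to-EBCDIC table holds only the
three byte mappings quoted in the passage (not the full table from the TR),
and the display step uses Python's cp037 and cp500 codecs as stand-ins for
1047, on the strength of the TR's remark that all three agree for these bytes.

```python
def i8_four_byte(cp: int) -> bytes:
    """Encode a code point in U+4000..U+3FFFF as a 4-byte I8 sequence.

    Hypothetical helper: lead byte 11110zzz, then three trail bytes 101xxxxx.
    """
    if not 0x4000 <= cp <= 0x3FFFF:
        raise ValueError("code point is outside the 4-byte I8 range")
    return bytes([
        0xF0 | (cp >> 15),            # top 3 bits
        0xA0 | ((cp >> 10) & 0x1F),   # next 5 bits
        0xA0 | ((cp >> 5) & 0x1F),    # next 5 bits
        0xA0 | (cp & 0x1F),           # low 5 bits
    ])

# Assumption: only the three I8-to-EBCDIC mappings quoted in the passage.
I8_TO_EBCDIC = {0xF1: 0xDD, 0xBF: 0x73, 0xB7: 0x66}

i8 = i8_four_byte(0xFEFF)
ebcdic = bytes(I8_TO_EBCDIC[b] for b in i8)

# Misread the UTF-EBCDIC bytes as single-byte EBCDIC text.
shown_037 = ebcdic.decode("cp037")
shown_500 = ebcdic.decode("cp500")

print(i8.hex(" ").upper())
print(ebcdic.hex(" ").upper())
print(shown_037, shown_500)
```

Because the second and fourth bytes are identical, the second and fourth
displayed characters have to be identical too, whatever the code page.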
The stray "a" in the passage which I marked with "[sic]" was left in for
accuracy, but it is not the error I was referring to. The TR contains
several such typos, so it would be unfair to single this one out.
-Doug Ewell
Fullerton, California