I have been studying Technical Report #6 on the Standard Compression
Scheme for Unicode*, and I am running into a problem that perhaps one
of the gurus on this list can explain for me.
In the example for Russian, the compressed data begins with an SC7
tag (0x17), which maps the subsequent bytes 0x80 through 0xFF into
the default position of (dynamic) window 7, as the accompanying
text points out.
However, according to Table X-5, the default offset for window 7 is
0xFF00. Window 2, on the other hand, does default to offset 0x0400
and would seem to be the correct window for Cyrillic (and is
identified as such in the table). The proper tag would then be SC2.
Am I missing something, or is there an error in the technical report?
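To make the arithmetic concrete, here is a minimal Python sketch of how I
understand the single-byte mapping. It is not taken from the report. The
function name decode_run and the sample bytes are my own, the offset table
holds only the two defaults quoted above, and I am assuming the SCn tags
are numbered consecutively, so that SC2 is 0x12 just as SC7 is 0x17.

```python
# Default dynamic window offsets as quoted above from the report's table.
# Only the two windows under discussion are listed here.
DEFAULT_OFFSETS = {2: 0x0400, 7: 0xFF00}

SC0 = 0x10  # assumption: SC0..SC7 are consecutive, so SC7 == 0x17


def decode_run(data):
    """Decode one SCn tag followed by bytes in the range 0x80..0xFF.

    Hypothetical helper for illustration only; a real decoder has to
    handle many more tags and states than this.
    """
    window = data[0] - SC0
    if window not in DEFAULT_OFFSETS:
        raise ValueError("tag 0x%02X is not covered by this sketch" % data[0])
    offset = DEFAULT_OFFSETS[window]
    chars = []
    for b in data[1:]:
        if b < 0x80:
            raise ValueError("byte 0x%02X is outside 0x80..0xFF" % b)
        # A byte 0x80..0xFF selects position 0..127 within the window.
        chars.append(chr(offset + (b - 0x80)))
    return "".join(chars)


# Sample bytes chosen by me so that window 2 yields Cyrillic letters.
payload = [0x9C, 0xBE, 0xC1, 0xBA, 0xB2, 0xB0]

with_sc2 = decode_run(bytes([0x12] + payload))
with_sc7 = decode_run(bytes([0x17] + payload))

print(["U+%04X" % ord(c) for c in with_sc2])
print(["U+%04X" % ord(c) for c in with_sc7])
```

With the SC2 tag the sample bytes land in U+0400..U+047F, the Cyrillic
block. With the SC7 tag the same bytes land in U+FF00..U+FF7F instead,
which is the discrepancy I am asking about.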
* What's wrong with the shorter and more straightforward "Standard
Unicode Compression Scheme," anyway? Someone got a problem with