From: Peter Kirk (firstname.lastname@example.org)
Date: Thu Aug 14 2003 - 15:30:19 EDT
On 14/08/2003 11:44, Jim Allan wrote:
> Peter Kirk posted:
>> The documentation is great, but I have had some problems copying text
>> from it (with Acrobat Reader 5), in particular with text in small
>> capitals e.g. Unicode character names. For example, I get the following
>> from p.44:
>> The sequence of Unicode characters U+0061 “a”
>> + U+0308 “!” + U+0075 “u”
>> unambiguously encodes “äu” not “aü”.
> This came out perfectly on my Windows 98 system as browsed by me in
> the Unicode list archives through Mozilla 1.3 and also after I pasted
> it into the Mozilla Compose window as quoted text.
> The characters, small capital or others, are displayed with no problems.
> Jim Allan
What seems to be happening in Windows 2000 is that the text on the
clipboard is made up of PUA character codes U+F7XX, where XX seems to
be the ASCII code of the corresponding lowercase letter. For example,
small caps "LATIN" comes out as F76C F761 F774 F769 F76E. At some point
Windows 98 simply strips off the F7s, giving you the correct text. But
Windows 2000, which is Unicode based, keeps the full PUA code points,
which in my Mozilla 1.4 are rendered as strange combinations of base
characters with combining marks. For example, "LATIN" comes out as a
sequence that appears on my screen (in Mozilla mail and when browsing
the archives with Mozilla) as N diaeresis, M macron, o
vertical-line-below, n macron, o acute dot-below.
When I browse the archives in IE6 or paste the text into Word, I get
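
To make the U+F7XX pattern described above concrete, here is a minimal sketch in Python of the "strip off the F7" step. It assumes the mapping is a plain offset of 0xF700 over the ASCII range, which is only what the "LATIN" example suggests; the function name strip_f7_pua is hypothetical and not part of Acrobat, Windows, or Mozilla.

```python
def strip_f7_pua(text: str) -> str:
    """Map code points U+F700..U+F77F back to the ASCII character
    held in their low byte; leave every other character unchanged.

    Assumption: the copied text uses a plain 0xF700 offset, as the
    "LATIN" example in this message suggests.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xF700 <= cp <= 0xF77F:
            out.append(chr(cp - 0xF700))
        else:
            out.append(ch)
    return "".join(out)


# The clipboard sequence quoted above: F76C F761 F774 F769 F76E
copied = "\uf76c\uf761\uf774\uf769\uf76e"
print(strip_f7_pua(copied))  # prints "latin"
```

Note that the recovered letters are lowercase, since 6C 61 74 69 6E are the ASCII codes for "latin"; the small-capital appearance comes from the font, not from the codes.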
-- 
Peter Kirk
email@example.com (personal)
firstname.lastname@example.org (work)
http://www.qaya.org/