Re: Unicode character transformation through XSLT

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Tue Mar 11 2003 - 11:48:02 EST

    Kenneth Whistler wrote:
    > "Unicode character (\uFFE2\uFF80\uFF93)"
    > ...
    > What you are actually looking for is the UTF-8 sequence:
    >
    > 0xE2 0x80 0x93

    The 8-bit UTF-8 bytes E2 80 93 (the encoding of U+2013 EN DASH, each byte with its most
    significant bit set) are being *sign-extended* to 16 bits, producing FFE2 FF80 FF93. In a UTF-8
    string literal it should suffice to rewrite the sequence as \xE2\x80\x93. Otherwise, find out
    where the widening from 8 to 16 bits with sign extension occurs.
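
To make the arithmetic concrete, here is a minimal Python sketch of the effect described above. It only simulates what a signed 8-bit to 16-bit widening would do; the function names (`widen_signed`, `widen_unsigned`) are hypothetical and do not come from the original thread, and the real conversion presumably happens somewhere in the poster's own toolchain.

```python
def widen_signed(data):
    """Widen each byte to 16 bits as if it were a signed 8-bit value."""
    units = []
    for b in data:
        # Reinterpret the byte as signed: values with the top bit set become negative.
        signed = b - 0x100 if b & 0x80 else b
        # Keep the low 16 bits of the two's-complement result (sign extension).
        units.append(signed & 0xFFFF)
    return units


def widen_unsigned(data):
    """Widen each byte to 16 bits as an unsigned value (zero extension)."""
    return [b & 0xFF for b in data]


# UTF-8 encoding of U+2013 EN DASH: E2 80 93
utf8 = "\u2013".encode("utf-8")

broken = widen_signed(utf8)    # the sign-extended code units
intact = widen_unsigned(utf8)  # what a zero-extending conversion yields

# Masking each damaged unit back to 8 bits recovers the original bytes.
repaired = bytes(u & 0xFF for u in broken).decode("utf-8")

print(" ".join("%04X" % u for u in broken))
print(" ".join("%04X" % u for u in intact))
```

Bytes below 0x80 (plain ASCII) are unaffected by the sign extension, which may be why such a bug goes unnoticed until non-ASCII text appears.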

    markus



    This archive was generated by hypermail 2.1.5 : Tue Mar 11 2003 - 12:34:51 EST