RE: Unicode character transformation through XSLT

From: Jain, Pankaj (MED, TCS) (Pankaj.Jain@med.ge.com)
Date: Wed Mar 12 2003 - 17:15:20 EST


    Hi Pim,
    Thanks for the reply.
    I modified my program as per your suggestion (changed to byChunk&127),
    but this time I am getting strange numbers.

    Here is the value in the database:
    E8C ? 6 to 10

    and the value that I am getting in the property file is:

    value=69566732980193254321161113249483277721223277

    But I need to get the following string:
    value= E8C \u2013 6 to 10
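
For reference, the `\u2013` form asked for here is what `java.util.Properties.store(OutputStream, String)` writes for a character above `\u007E`, provided the string handed to `setProperty` already contains the real en dash. A minimal sketch follows; the class name `PropertiesEscapeDemo` and the helper `store` are hypothetical and are not part of the original program.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class PropertiesEscapeDemo {
    // Hypothetical helper: serialises one key/value pair with Properties.store
    // and returns the resulting file content as a String.
    static String store(String key, String value) {
        Properties prop = new Properties();
        prop.setProperty(key, value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            // The OutputStream overload writes ISO 8859-1 and escapes other
            // characters as \\uXXXX; a null comment is allowed.
            prop.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The value contains a real U+2013 EN DASH, written here as a source escape.
        System.out.println(store("value", "E8C \u2013 6 to 10"));
    }
}
```

The output starts with a date comment line, followed by the escaped entry. Note that the `store(Writer, String)` overload does not apply this escaping.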

    For your information, the type of the variable chunk is int.

    int chunk = 0;

    while (rsResult.next())
    {
        /* Get the file contents from the value column */
        ipStream = rsResult.getBinaryStream("VALUE");
        strBuf = new StringBuffer();
        while ((chunk = ipStream.read()) != -1)
        {
            byte byChunk = new Integer(chunk).byteValue();
            strBuf.append((char) byChunk&127);
        }
        prop.setProperty(rsResult.getString("KEY"), strBuf.toString());
    }
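
One likely cause of the digits: in Java a cast binds more tightly than `&`, so `(char) byChunk&127` is evaluated as `((char) byChunk) & 127`. That expression has type `int`, so `StringBuffer.append(int)` is selected and appends decimal digits instead of a character. The sketch below reproduces this under the assumption that the column holds UTF-8 bytes; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class CastPrecedenceDemo {
    // Same expression as in the posted loop: the cast applies to byChunk only,
    // the & produces an int, and append(int) writes its decimal digits.
    static String asPosted(byte[] data) {
        StringBuffer strBuf = new StringBuffer();
        for (byte byChunk : data) {
            strBuf.append((char) byChunk & 127);
        }
        return strBuf.toString();
    }

    // Parenthesised variant: mask first, then cast, so append(char) is selected.
    static String maskedThenCast(byte[] data) {
        StringBuffer strBuf = new StringBuffer();
        for (byte byChunk : data) {
            strBuf.append((char) (byChunk & 127));
        }
        return strBuf.toString();
    }

    public static void main(String[] args) {
        // Assumption: the database stores the text as UTF-8.
        byte[] data = "E8C \u2013 6 to 10".getBytes(StandardCharsets.UTF_8);
        System.out.println(asPosted(data));
        System.out.println(maskedThenCast(data));
    }
}
```

With this input, `asPosted` returns a digit string that is a prefix of the value reported above. Adding the parentheses restores characters, but masking with 127 still destroys the three bytes of the en dash, so it does not produce the wanted string on its own.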

    Let me know if I need to set any property for the property file.

    Thanks
    -Pankaj

                                    
    -----Original Message-----
    From: Pim Blokland [mailto:pblokland@planet.nl]
    Sent: Wednesday, March 12, 2003 12:34 PM
    To: Unicode mailing list
    Subject: Re: Unicode character transformation through XSLT

    Jain, Pankaj (MED, TCS) wrote:

    > while((chunk = ipStream.read())!=-1)
    > {
    > byte byChunk = new Integer(chunk).byteValue();
    > strBuf.append((char) byChunk);
    > }

    You don't say which type your "chunk" variable is, but the problem
    is definitely in the number of conversions you do.
    In this tiny piece of code you convert the input from (whatever
    "chunk" is) into Integer, then to byte and finally to char.
    As I understand it, char is an unsigned 16-bit type in Java, while
    byte is signed, so bytes above 127 are sign-extended when cast to
    char. Hence the problem. You can try stripping
    off the high bits after conversion to char (i.e. (byChunk&127) at
    the end) or try to circumvent all those conversions altogether.
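
One way to circumvent the per-byte conversions is to let an `InputStreamReader` decode the stream with an explicit charset. This is only a sketch: the charset (UTF-8 here) is an assumption that must match how the column was actually written, and the class `StreamDecodeDemo` and helper `readAll` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StreamDecodeDemo {
    // Hypothetical helper: reads the whole stream as text in the given charset.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            int c;
            // Reader.read() returns one decoded UTF-16 code unit, or -1 at end.
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for rsResult.getBinaryStream("VALUE"); E2 80 93 is the
        // UTF-8 encoding of U+2013 (assumed storage encoding).
        byte[] stored = {'E', '8', 'C', ' ', (byte) 0xE2, (byte) 0x80, (byte) 0x93, ' ', '6'};
        String text = readAll(new ByteArrayInputStream(stored), StandardCharsets.UTF_8);
        System.out.println(text.equals("E8C \u2013 6"));
    }
}
```

The decoded string can then be passed to `setProperty` unchanged, with no byte-to-char casts or masking involved.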

    Pim Blokland


