Re: Need help about unicode encoding with Perl !

From: John Delacour (JD@BD8.COM)
Date: Tue Sep 09 2003 - 09:43:05 EDT


    At 11:01 am +0800 9/9/03, Hu Guoxin wrote:
    >>> hello everyone:
    >>>
    >>> I'm using Perl to develop a web-site.
    >>
    >> But you don't tell which version of Perl.
    >
    > it's Perl 5.6.1.

    If you get

    ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP936.TXT

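    then you can build a conversion table from it. The
    script below simply prints the two values from each
    line (the CP936/GBK bytes and the Unicode value as
    two big-endian bytes), but you can store them in a
    hash instead to do what you want. In Perl 5.8+ most
    people would use the Encode module, I suppose.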

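    #!/usr/bin/perl
    use strict;
    use warnings;

    my $f = "/Users/Shared/Downloads/CP936.TXT";  # path to downloaded file
    open(my $fh, '<', $f) or die "Cannot open $f: $!";
    binmode $fh;
    binmode STDOUT;                  # the output is raw bytes, not text
    while (<$fh>) {
        s/[\015\012]+$//;            # strip an LF or CRLF line ending
        next if /^#/;                # skip comment lines
        s/\t#.*$//;                  # drop the trailing comment field
        s/0x//g;                     # drop the hex prefixes
        # CP936 column: either one byte or two bytes
        s/^([0-9A-F]{2})\t/chr(hex $1) . "\t"/e
            or s/^([0-9A-F]{2})([0-9A-F]{2})/chr(hex $1) . chr(hex $2)/e;
        # Unicode column: two bytes, most significant first
        s/\t([0-9A-F]{2})([0-9A-F]{2})/"\t" . chr(hex $1) . chr(hex $2)/e;
        print "$_\n";
    }
    close $fh;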
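
The "build a hash" step could look roughly like the sketch below. It is only an illustration: the four sample lines stand in for the real contents of CP936.TXT, and the names %gbk2uni, %uni2gbk and gbk_to_codepoints are made up for this example. It uses nothing beyond core Perl, so it should also run on 5.6.1.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample lines in the CP936.TXT layout (assumption: in real use,
# read every line of the downloaded file instead).
my @lines = (
    "0x41\t0x0041\t#LATIN CAPITAL LETTER A",
    "0xD6D0\t0x4E2D\t#CJK UNIFIED IDEOGRAPH",
    "0xB9FA\t0x56FD\t#CJK UNIFIED IDEOGRAPH",
    "0xC8CB\t0x4EBA\t#CJK UNIFIED IDEOGRAPH",
);

my (%gbk2uni, %uni2gbk);
for (@lines) {
    # Skip comment lines and bytes that have no Unicode value.
    next unless /^0x([0-9A-F]+)\t0x([0-9A-F]+)/;
    my $gbk = pack "H*", $1;    # one or two raw bytes
    my $uni = hex $2;           # Unicode code point as a number
    $gbk2uni{$gbk} = $uni;
    $uni2gbk{$uni} = $gbk;
}

# Hypothetical helper: GBK byte string -> list of Unicode code points.
sub gbk_to_codepoints {
    my ($bytes) = @_;
    my @cp;
    # A lead byte 0x81-0xFE takes a trail byte; other bytes stand alone.
    while ($bytes =~ /\G([\x81-\xFE][\x40-\xFE]|[\x00-\x80\xFF])/g) {
        push @cp, $gbk2uni{$1};
    }
    return @cp;
}

printf "U+%04X\n", $_ for gbk_to_codepoints("\xD6\xD0\xB9\xFA\xC8\xCB");
# prints U+4E2D, U+56FD and U+4EBA, one per line
```

Going the other way is a lookup in %uni2gbk for each code point. A byte sequence that is missing from the table gives undef here, so a real script would want to decide how to report that.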

    >> From this website http://rf.net/~james/perli18n.html (the first
    >> site Google finds for unicode perl web) I see that different
    >> versions of Perl have different levels of support for Unicode.
    >
    > I just want to know how to convert between encodings.
    >
    > For example:
    >
    > "中国人。" is a sentence in GBK code.
    > (1) How do I convert it into Unicode?
    > (2) And how do I convert it back?
    > (3) If the user's OS is Japanese, how do I convert
    >     the Unicode message into Shift-JIS as "中国人"?
    >
    > Thanks a lot!
    >
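
For questions (1) to (3), on Perl 5.8 or later the Encode module does the table lookups for you, so it is not an option on 5.6.1 itself. A minimal sketch, assuming the input really is CP936/GBK bytes and that the encoding names "cp936" and "shiftjis" are the ones your Encode installation provides:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);   # core module from Perl 5.8 on

# U+4E2D U+56FD U+4EBA as GBK (CP936) bytes
my $gbk = "\xD6\xD0\xB9\xFA\xC8\xCB";

my $text = decode("cp936", $gbk);       # (1) GBK bytes -> Perl character string
my $back = encode("cp936", $text);      # (2) character string -> GBK bytes
my $sjis = encode("shiftjis", $text);   # (3) same text as Shift-JIS bytes

printf "code points: %s\n",
    join " ", map { sprintf "U+%04X", ord } split //, $text;
printf "round trip:  %s\n", $back eq $gbk ? "ok" : "differs";
printf "Shift-JIS:   %s\n",
    join " ", map { sprintf "%02X", ord } split //, $sjis;
```

"Unicode" inside Perl means a character string. When it has to leave the program as bytes, pick a concrete form such as encode("UTF-8", $text).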


