Re: Need program to convert UTF-8 -> Hex sequences

From: Dan Kogai (dankogai@dan.co.jp)
Date: Tue Mar 04 2003 - 14:25:08 EST


    On Tuesday, Mar 4, 2003, at 07:59 Asia/Tokyo, David Oftedal wrote:
    > Hello!
    >
    > Sorry to make this a mass spam, but I need a program to convert UTF-8
    > to hex sequences. This is useful for embedding text in non-UTF web
    > pages, but also for creating a Yudit keymap file, which I'm doing at
    > the moment.
    >
    > For example, a file with the content æøå would yield the output
    > "0x00E6 0x00F8 0x00E5", and the Japanese expression あの人 would yield
    > "0x3042 0x306E 0x4EBA".
    >
    > Can anyone tell me how to do it without making a program for it
    > myself? It would be VERY helpful, and I've already made 2 programs for
    > assembling this file and I'm not starting on another just yet.

    Perl 5.8 lets you do this in a one-liner:

    perl -MEncode -ple '$_=join(" ",map {sprintf "0x%04X", $_} unpack("U*",
    decode("utf8",$_)))'

    A more descriptive script follows:

    #
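    use strict;
    use Encode;
    while(<>){
            chomp $_;
            # turn the raw UTF-8 bytes into a string of Perl characters
            my $line = decode("utf8" => $_);
            # one integer code point per character
            my (@chars) = unpack("U*" => $line);
            # format each code point as 0xXXXX (at least four hex digits)
            my (@hexed) = map {sprintf "0x%04X", $_} @chars;
            my $hexed = join(" " => @hexed);
            print $hexed, "\n";
    }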
    __END__
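
    As a rough sketch, the same decode / unpack / sprintf steps can also be
    wrapped in a function, which makes them easy to check against the
    samples in the question. The name utf8_to_hex is made up for this
    illustration; it is not part of Encode.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of Encode): takes a string of raw UTF-8
# bytes and returns its code points as space-separated "0xXXXX" tokens.
sub utf8_to_hex {
    my ($bytes) = @_;
    my $chars = decode("utf8", $bytes);      # bytes -> Perl characters
    return join " ", map { sprintf "0x%04X", $_ } unpack("U*", $chars);
}

# The samples from the question, written as byte escapes so the script
# does not depend on the encoding of its own source file.
print utf8_to_hex("\xC3\xA6\xC3\xB8\xC3\xA5"), "\n";              # æøå
print utf8_to_hex("\xE3\x81\x82\xE3\x81\xAE\xE4\xBA\xBA"), "\n";  # あの人
```

    The two print statements should give "0x00E6 0x00F8 0x00E5" and
    "0x3042 0x306E 0x4EBA", the sequences asked for in the question.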

    An even funkier example:

    #
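    package Encode::Hex;
    use strict;
    use base qw(Encode::Encoding);
    # register this class with Encode under the name 'hex'
    __PACKAGE__->Define('hex');
    sub encode($$;$){
         my ($obj, $str, $chk) = @_;
         # keep newlines as they are; every other character becomes 0xXXXX
         my @hexed =
             map {$_ == ord("\n") ? chr($_) : sprintf "0x%04X", $_}
                 unpack("U*" => $str);
         # when a check value is passed, empty the caller's buffer in place
         $_[1] = '' if $chk;
         return join(" " => @hexed);
    }
    package main;
    # decode standard input as UTF-8; encode everything printed via 'hex'
    binmode STDIN => ":utf8";
    binmode STDOUT => ":encoding(hex)";
    while(<>){
         chomp;
         print $_, "\n";
    }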
    __END__

    Dan the (Perl5 Porter|Encode Maintainer)


