Re: how to sort by stroke (not radical/stroke)

From: Dan Kogai (dankogai@dan.co.jp)
Date: Tue May 13 2003 - 15:16:52 EDT

    On Tuesday, May 13, 2003, at 11:48 PM, John Jenkins wrote:
    >> Stroke order, then, is something different. Seems like we would
    >> need order entries in the config data for every character, which
    >> would be totally unmanageable.
    >>
    >> I didn't have any luck searching the Unicode web site for information
    >> about sorting by stroke.
    >>
    >
    > There is a kTotalStrokes field in Unihan.txt, although it doesn't
    > cover every character in Unihan. This would definitely be a good
    > place to start.

    If you are using Perl 5.6.0 or higher (5.8.0 recommended), you can use
    the Unicode::Unihan module, available via CPAN. Let me show you a small
    example.

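    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use Unicode::Unihan;
    my $uh = Unicode::Unihan->new;
    my $str = "\x{5c0f}\x{98fc}\x{5f3e}"; # my name in Kanji
    # Split the string into one-character strings.
    my @chars = map {chr($_)} unpack("U*" => $str);
    # In list context, TotalStrokes returns one value per character.
    my @strokes = $uh->TotalStrokes($str);
    # Map each character to its stroke count.
    my %c2s; @c2s{@chars} = @strokes;
    binmode STDOUT => ':utf8';
    # Sort by stroke count, breaking ties by code point.
    for my $char (sort {$c2s{$a} <=> $c2s{$b} || $a cmp $b} @chars){
        print "$char => $c2s{$char}\n";
    }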
    __END__

    And here is what it prints.

    小 => 3
    弾 => 12
    飼 => 14

    I am not sure if Unicode::Unihan is robust enough for practical use,
    but IMHO it is a handy place to start.
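
    Since kTotalStrokes does not cover every character, a practical sort
    probably needs a fallback for characters with no stroke count. The
    following is only a minimal sketch of one way to do that: it assumes the
    counts have already been collected into a hash (hard-coded here from the
    output above rather than looked up), and sort_by_strokes is a
    hypothetical helper name, not part of Unicode::Unihan. Characters
    without a count are placed last, in code point order.

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical helper: sort characters by stroke count.
# $c2s is a hash ref mapping a character to its stroke count.
# Characters with no count sort after all counted characters,
# and ties are broken by code point.
sub sort_by_strokes {
    my ($c2s, @chars) = @_;
    return sort {
        my ($sa, $sb) = ($c2s->{$a}, $c2s->{$b});
        (defined $sa ? 0 : 1) <=> (defined $sb ? 0 : 1)  # known counts first
            || (defined $sa ? $sa <=> $sb : 0)           # then by stroke count
            || $a cmp $b                                 # then by code point
    } @chars;
}

# Stroke counts copied from the output above (assumed, not looked up here).
my %c2s = ("\x{5c0f}" => 3, "\x{5f3e}" => 12, "\x{98fc}" => 14);

# U+3042 (hiragana A) has no entry in %c2s, so it should come last.
my @sorted = sort_by_strokes(\%c2s,
    "\x{98fc}", "\x{3042}", "\x{5f3e}", "\x{5c0f}");

binmode STDOUT => ':utf8';
print join(" ", map { sprintf "U+%04X", ord } @sorted), "\n";
```

    With the table above this prints "U+5C0F U+5F3E U+98FC U+3042". Whether
    uncounted characters should go last, first, or be skipped entirely is a
    design choice that depends on the application.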

    Dan the Perl5 Porter


