Re: script detection program

From: Mark Davis (mark@macchiato.com)
Date: Thu Sep 26 2002 - 08:28:49 EDT

    ICU doesn't have a tool specifically for that, but it does have API
    support for looking up the script of a character (and for character
    conversion), so it would be very simple for you to write such a tool:
    just open the file (with whatever conversion is required) and scan the
    contents. See http://oss.software.ibm.com/icu.
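
    As a rough illustration of the "open the file and scan the contents"
    approach, here is a minimal sketch. It is an assumption-laden stand-in:
    it uses the Java standard library's Character.UnicodeScript rather than
    ICU itself, and the class and method names (ScriptScan, scriptsIn,
    scriptsInFile) are hypothetical. ICU's own script lookup (for example
    uscript_getScript in its C API) would play the same role.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper class for illustration; not part of ICU or of this thread.
class ScriptScan {

    // Collect the script of every code point in the text.
    // COMMON and INHERITED (spaces, digits, punctuation, combining marks)
    // and UNKNOWN (unassigned or unpaired-surrogate code points) are skipped,
    // since they do not identify a script on their own.
    static Set<Character.UnicodeScript> scriptsIn(String text) {
        Set<Character.UnicodeScript> found =
                EnumSet.noneOf(Character.UnicodeScript.class);
        text.codePoints().forEach(cp -> {
            Character.UnicodeScript script = Character.UnicodeScript.of(cp);
            if (script != Character.UnicodeScript.COMMON
                    && script != Character.UnicodeScript.INHERITED
                    && script != Character.UnicodeScript.UNKNOWN) {
                found.add(script);
            }
        });
        return found;
    }

    // Decode a UTF-16 file and scan it. StandardCharsets.UTF_16 uses a
    // byte order mark if present and assumes big-endian otherwise.
    static Set<Character.UnicodeScript> scriptsInFile(Path file) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_16);
        return scriptsIn(text);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ScriptScan <utf16-file>");
            return;
        }
        System.out.println(scriptsInFile(Paths.get(args[0])));
    }
}
```

    Note that this reports Java's long enum names (LATIN, CYRILLIC) rather
    than the four-letter tags asked for in the original question; mapping
    to those codes is left out of the sketch.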

    Mark
    __________
    http://www.macchiato.com
    ◄ “Eppur si muove” ►

    ----- Original Message -----
    From: "chuck clemens" <clemens900@hotmail.com>
    To: <unicode@unicode.org>
    Sent: Wednesday, September 25, 2002 21:02
    Subject: script detection program

    > Does anyone have a program or tool that can identify the scripts which the
    > characters in a UTF-16 encoded file belong to?
    >
    > I'd like a program that can scan the data and return script tags such as
    > those used in http://www.unicode.org/unicode/reports/tr24/
    >
    > so if I had a UTF-16 encoded file with Latin and Cyrillic characters, the
    > tool/program would scan the text and return the names "latn" and "cyrl"



    This archive was generated by hypermail 2.1.5 : Thu Sep 26 2002 - 09:30:26 EDT