Re: character groupings in various languages

From: Ben Dougall (bend@freenet.co.uk)
Date: Fri May 16 2003 - 12:38:21 EDT


    anyone? would the uca and collation be the way to ascertain the various
    character groupings / categorisations that are specific to a given
    language? i'd like to get graded matches, rather than just an absolute
    match or no match.

    am i on the right track there? or is there a better direction maybe?
    i'm looking for a reasonably even coverage of the main languages.

    just checking. thanks.
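
To make "more than just an absolute match or no match" concrete, here is a minimal Python sketch of a graded comparison. It is not the UCA and does no language tailoring; it only approximates the idea of comparison levels using the standard `unicodedata` module. The names `strip_marks` and `match_level` are made up for illustration.

```python
import unicodedata

def strip_marks(text: str) -> str:
    """Decompose the text (NFD) and drop combining marks such as accents."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def match_level(a: str, b: str) -> str:
    """Return how closely two strings match (hypothetical helper).

    "exact" : identical code points
    "case"  : equal once case is folded
    "base"  : equal once case is folded and combining marks are removed
    "none"  : no match at any of the levels above
    """
    if a == b:
        return "exact"
    if a.casefold() == b.casefold():
        return "case"
    if strip_marks(a).casefold() == strip_marks(b).casefold():
        return "base"
    return "none"
```

A real collator (for example one built on the UCA) would take the language into account when deciding which differences matter; this sketch treats every language the same way.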

    On Thursday, May 15, 2003, at 11:03 pm, Ben Dougall wrote:

    > would it be the uca / collation
    > <http://www.unicode.org/unicode/reports/tr10/> that will allow me to
    > do this? :
    >
    > having specified which language is being used, compare one character
    > to another and find out which groupings they do or do not share. for
    > example, comparing an 'F' and a 'W' in english, they would match on
    > case (and even on both being consonants). case categories i'm sure
    > don't exist in some other languages, but then i'm sure there are many
    > other types of categorisation in other languages that english
    > doesn't have.
    >
    > i'd like to have access to any kind of character categories /
    > groupings that may be applicable to whichever language is initially
    > specified.
    >
    > is the uca what i need to look into for that type of thing?
    >
    >
    > also i notice icu <http://oss.software.ibm.com/icu/> has a lot of
    > collation stuff. how does that compare to unicode's collation (if
    > collation is even what i'm after, that is)?
    >
    > thanks.
    >
    >
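
As an illustration of the 'F' versus 'W' comparison in the quoted message, here is a small Python sketch that reports which language-independent Unicode properties two characters share. It assumes the standard `unicodedata` module; the function name `shared_groupings` is hypothetical. Note that the general category covers things like letter/digit and upper/lower case, but it does not encode vowel versus consonant, which would have to come from per-language data.

```python
import unicodedata

def shared_groupings(a: str, b: str) -> dict:
    """Compare two single characters by their Unicode general category.

    Illustrative helper only: the properties used here are the same for
    every language, so no language-specific grouping is captured.
    """
    cat_a = unicodedata.category(a)  # e.g. 'Lu' = Letter, uppercase
    cat_b = unicodedata.category(b)
    return {
        "same_general_category": cat_a == cat_b,   # e.g. both 'Lu'
        "same_major_class": cat_a[0] == cat_b[0],  # e.g. both letters ('L')
        "both_uppercase": a.isupper() and b.isupper(),
        "both_lowercase": a.islower() and b.islower(),
    }

if __name__ == "__main__":
    # Print the shared properties for the example pair from the message.
    for key, value in shared_groupings("F", "W").items():
        print(f"{key}: {value}")
```

For scripts without case, the two case entries simply come out false, while the category entries still apply.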


