Re: accented Latin characters sort order, non-language dependant

From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Mon Jul 10 2006 - 07:23:08 CDT

    On Mon, 10 Jul 2006, Cristian Secar wrote:

    > I have to make a spreadsheet with a few accented characters and their
    > coverage for a few languages. How do I sort them alphabetically?

    Using the Unicode Collation Algorithm,
    http://www.unicode.org/reports/tr10/
    would appear to be suitable here, since the context is multilingual.

    In practice, if you have just a few characters, you could look up their
    relative order in the collation charts at
    http://www.unicode.org/charts/collation/
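
    If you would rather script it, the following Python sketch gives a rough
    idea of the multi-level comparison involved: base letters first, then
    accents, then case. It is only an approximation built on the standard
    unicodedata module, not the Unicode Collation Algorithm itself; the name
    rough_key is made up for this example, and letters that have no canonical
    decomposition (such as "ø") simply fall back to code point order. For real
    work, a library that implements the algorithm with its weight table would
    be the better choice.

```python
import unicodedata


def rough_key(text):
    """Three-level sort key loosely imitating base letter / accent / case levels.

    This is NOT the Unicode Collation Algorithm: it uses no collation weight
    table and only handles characters that decompose into a base letter plus
    combining marks.
    """
    decomposed = unicodedata.normalize("NFD", text)
    # Level 1: base letters only, case ignored.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Level 2: the combining marks (accents), in order of appearance.
    marks = tuple(ch for ch in decomposed if unicodedata.combining(ch))
    # Level 3: case, with lowercase ordered before uppercase.
    return (base.casefold(), marks, base.swapcase())


letters = ["z", "é", "e", "E", "a", "ä", "o", "ö"]
print(sorted(letters))                 # code point order: E a e o z ä é ö
print(sorted(letters, key=rough_key))  # a ä e E é o ö z
```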

    > I know that this is highly language dependent, but I also remember that
    > I was once told about a (Unicode?) document with an abstract sort
    > order of many (Latin?) characters. I cannot remember what document that
    > was - is this something [well] known?

    The algorithm is a separate standard issued by the Unicode Consortium. It
    can be used either as such (typically, in multilingual contexts) or as the
    "lowest level" algorithm, possibly with several layers of locale-specific
    rules above it. If you have data in some particular language, with a few
    words containing foreign characters, you could use the sorting rules of
    that language as the "higher level" algorithm, falling back to the Unicode
    Collation Algorithm for characters that those rules do not cover.
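
    To illustrate the layering, here is a minimal Python sketch that assumes
    Swedish as the "higher level" language, where å, ä and ö are separate
    letters sorted after z. The names SWEDISH_EXTRA, fallback_key and
    layered_key are hypothetical, and fallback_key is only a stand-in for the
    language-neutral order (a real setup would take that order from the
    Unicode Collation Algorithm). The key is also simplified to a single
    level, so accents are compared letter by letter.

```python
import unicodedata

# Hypothetical tailoring: Swedish treats å, ä, ö as letters that follow z.
SWEDISH_EXTRA = {"å": 1, "ä": 2, "ö": 3}


def fallback_key(ch):
    """Stand-in for the language-neutral order: base letter, then accents."""
    decomposed = unicodedata.normalize("NFD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    marks = "".join(c for c in decomposed if unicodedata.combining(c))
    return (base.casefold(), marks)


def layered_key(word):
    """Apply the language rules first; fall back for characters they omit."""
    key = []
    for ch in unicodedata.normalize("NFC", word):
        rank = SWEDISH_EXTRA.get(ch.lower())
        if rank is not None:
            key.append((1, rank, ""))            # covered by the tailoring
        else:
            key.append((0, *fallback_key(ch)))   # not covered: default order
    return key


words = ["zebra", "öl", "ost", "ära", "éclair", "apa", "ål"]
print(sorted(words, key=layered_key))
# ['apa', 'éclair', 'ost', 'zebra', 'ål', 'ära', 'öl']
```

    The foreign "é" is not covered by the Swedish rules, so it falls through
    to the default order and sorts with "e".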

    -- 
    Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
    

