Re: accented Latin characters sort order, non-language dependant

From: Otto Stolz
Date: Tue Jul 11 2006 - 11:15:33 CDT

    Philippe Verdy wrote:
    > the German sort order has long been very well established;
    > it interprets the umlaut as an E,
    > and not as an optional diaeresis diacritic.

    My point was (and still is) that this is not *the* German sort order,
    but rather *a possible* German sort order.

    > Note that dictionaries always exhibit the complete orthography
    > and not an abbreviated form, so the umlaut would always be present
    > and shown; that's a good reason why it is possible to treat
    > "ü" after "u" and not with "ue".

    The well-established sort order used in dictionaries and
    encyclopedias, and complying with DIN 5007-1, treats
    "ü" together with "u" (not after it) at the first level. This
    is also the preferred order for most sorting purposes (except
    phone directories).

    The "'ü' after 'u'" order is not normally used in Germany,
    and I was surprised to read, in the Wikipedia page I had quoted,
    that it is applied in Austrian phone books.
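
    The three orders mentioned so far differ only in how the first-level
    sort key treats the umlaut. The following is a minimal Python sketch,
    not taken from DIN 5007 or from any library: the function names and
    the word list are made up for illustration, and only the first level
    is modelled (no tie-breaking between, say, "Muller" and "Müller").

```python
import unicodedata

# Hypothetical helper tables for illustration; only the first level is modelled.
AS_BASE = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
AS_BASE_PLUS_E = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})
# Assumption: only the three umlauts are handled here; "ß" is left as-is.
UMLAUT_BASE = {"ä": "a", "ö": "o", "ü": "u"}


def _prepare(word):
    # Use precomposed characters and ignore case at the first level.
    return unicodedata.normalize("NFC", word).lower()


def dictionary_key(word):
    """'ü' together with 'u' (the dictionary order described above)."""
    return _prepare(word).translate(AS_BASE)


def phonebook_key(word):
    """'ü' treated as 'ue' (the phone directory order)."""
    return _prepare(word).translate(AS_BASE_PLUS_E)


def umlaut_after_base_key(word):
    """'ü' as a separate letter after 'u' (as reported for Austrian phone books)."""
    return tuple(
        (UMLAUT_BASE.get(ch, ch), ch in UMLAUT_BASE) for ch in _prepare(word)
    )


words = ["Mutter", "Mühle", "Mufti", "Mucke"]
by_dictionary = sorted(words, key=dictionary_key)
by_phonebook = sorted(words, key=phonebook_key)
by_umlaut_after = sorted(words, key=umlaut_after_base_key)
```

    With this word list, the dictionary key gives Mucke, Mufti, Mühle,
    Mutter; the phone directory key gives Mucke, Mühle, Mufti, Mutter;
    and the "after u" key gives Mucke, Mufti, Mutter, Mühle.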

    > But I note that German nouns that start with the "Über..." prefix
    > are sorted as "Ueber..." and not between "Ub..." and "Uc..."

    Where have you observed this order?

    In all German dictionaries and encyclopedias I know of,
    "über..." would go between "U-Bahn" and "U-Boot", and
    definitely not after "UdSSR".
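
    To make the comparison concrete, here is a small Python sketch of the
    two placements. It assumes that hyphens and case are ignored at the
    first level, which is what the "U-Bahn" / "U-Boot" placement implies;
    the function name first_level_key and the sample word "Überfall" are
    chosen only for illustration.

```python
# Hypothetical first-level key; hyphens and case are assumed to be ignored.
UMLAUT_AS_BASE = {"ä": "a", "ö": "o", "ü": "u", "ß": "ss"}
UMLAUT_AS_E = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}


def first_level_key(word, umlaut_as_e=False):
    table = UMLAUT_AS_E if umlaut_as_e else UMLAUT_AS_BASE
    # Keep letters only, so "U-Bahn" is compared as "ubahn".
    letters = (ch for ch in word.lower() if ch.isalpha())
    return "".join(table.get(ch, ch) for ch in letters)


words = ["UdSSR", "U-Boot", "Überfall", "U-Bahn"]

# 'ü' together with 'u': "uberfall" falls between "ubahn" and "uboot".
dictionary_order = sorted(words, key=first_level_key)

# 'ü' as 'ue': "ueberfall" ends up after "udssr".
ue_order = sorted(words, key=lambda w: first_level_key(w, umlaut_as_e=True))
```

    Under the first key the list comes out as U-Bahn, Überfall, U-Boot,
    UdSSR; under the second, as U-Bahn, U-Boot, UdSSR, Überfall.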

    > (the presence of the leading CAPITAL is an important
    > distinction, because an umlaut over a capital U is often
    > difficult to distinguish from a capital U without umlaut).
    > Sorting it as "UE" gives a visual clue that the umlaut
    > is present and required.

    Still, German dictionaries and encyclopedias do not work
    this way; rather, they tend to choose a font that makes
    the umlaut clearly visible.

    > But it was only an example of how language-independent
    > handling of diacritics is not as simple as just
    > dropping diacritics from the primary sort order
    > (or primary collation level) for the Latin script.

    I fully agree with this remark; still, I felt compelled
    to get your example right and to put it into proportion.

    Best wishes,
       Otto Stolz
