Re: Umlaut and Tréma, was: Variation selectors and vowel marks

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Wed Jul 14 2004 - 23:00:17 CDT

Next message: Peter Kirk: "Re: Umlaut and Tréma, was: Variation selectors and vowel marks"

Previous message: Kenneth Whistler: "Re: Umlaut and Tréma, was: Variation selectors and vowel marks"
In reply to: Doug Ewell: "Re: Umlaut and Tréma, was: Variation selectors and vowel marks"
Next in thread: Peter Kirk: "Re: Umlaut and Tréma, was: Variation selectors and vowel marks"
Reply: Peter Kirk: "Re: Umlaut and Tréma, was: Variation selectors and vowel marks"

At 01:52 PM 7/14/2004, Doug Ewell wrote:
>It's not German data (with umlauts) that will be affected by this
>solution, but non-German data (with diaereses) in German bibliographic
>systems. That makes it a much smaller problem.

The use of a diaeresis is perfectly valid for words in fields that have the
language ID 'German'.

>The DIN request and the USNB solution didn't address this, because the
>problem to be solved was disambiguating {a, o, u}-with-tréma from {a, o,
>u}-with-umlaut. If there are combinations of (for example)
>a-with-tréma-and-something-else AND ALSO
>a-with-umlaut-and-something-else, then those two will need to be
>disambiguated somehow. But I strongly doubt that the latter case exists
>in German bibliographic data, though of course one never knows.

First off, there have to be corresponding entries in the sorting tables
used for such data for that distinction to have the correct effect. Since
the sorting tables would not support anything other than <BASE, CGJ,
DIAERESIS>, there's no reason to introduce other sequences into the data.
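
To make the sequence concrete, here is a minimal Python sketch of how data
could be scanned for the two cases. It assumes the convention discussed in
this thread: <BASE, U+0308> is read as an umlaut and <BASE, CGJ (U+034F),
U+0308> as a tréma. The function name classify_marks is hypothetical and
only serves as an illustration.

```python
import unicodedata

CGJ = "\u034F"        # COMBINING GRAPHEME JOINER
DIAERESIS = "\u0308"  # COMBINING DIAERESIS

def classify_marks(text):
    """Label each base letter that carries U+0308 as 'umlaut' or 'trema'.

    Assumption (from the thread, not a general Unicode rule):
    <base, U+0308> is an umlaut, <base, CGJ, U+0308> is a tréma.
    """
    # NFD splits precomposed letters such as U+00E4 into base + U+0308.
    s = unicodedata.normalize("NFD", text)
    found = []
    i = 0
    while i < len(s):
        if s.startswith(CGJ + DIAERESIS, i + 1):
            found.append((s[i], "trema"))
            i += 3
        elif s.startswith(DIAERESIS, i + 1):
            found.append((s[i], "umlaut"))
            i += 2
        else:
            i += 1
    return found

print(classify_marks("H\u00e4user"))
print(classify_marks("Kapernau\u034f\u0308m"))
```

Because CGJ has combining class 0, normalization neither reorders it nor
composes the base letter with the following diaeresis, so the distinction
survives NFC and NFD.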

Secondly, the diaeresis is used to indicate that two vowels are pronounced
separately. I haven't seen a case where the vowels would already be accented.

Finally, one of the additional reasons that the phonetic sorting is
relevant in this instance, other than that the pronunciations are in fact
different, is that the use of the diaeresis is not mandatory to the same
degree as that of the umlaut. You can find Kapernaum spelled with and
without a diaeresis, but if you spell Hauser with an umlaut, it's the
plural of Haus; without it, it's a name. Personal names, however, are
sometimes spelled with vowel + e (Moeller).

By sorting the diaeresis as a secondary difference, related terms do sort
together, and names sort near their variant spellings. The suggested
approach solves the problem at hand for those data where somebody took the
trouble to decide (on input) which was which, so that huge catalogs of
subject keywords or authors come out correctly.
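
As a rough illustration of what "secondary difference" means here, the
following Python sketch builds a two-level sort key. It is not the sorting
table of any actual bibliographic system. It assumes that a tréma is
encoded as <BASE, CGJ, U+0308> and differs from the bare vowel only at the
second level, and that an umlaut on a, o, or u expands to vowel + e at the
first level (an assumption modeled on the Moeller spelling mentioned
above). The name sort_key and the weights 0, 1, 2 are hypothetical.

```python
import unicodedata

CGJ = "\u034F"        # COMBINING GRAPHEME JOINER
DIAERESIS = "\u0308"  # COMBINING DIAERESIS

def sort_key(word):
    """Return (primary, secondary) for a two-level comparison.

    Assumed weights: 0 = plain letter, 1 = tréma, 2 = umlaut.
    """
    s = unicodedata.normalize("NFD", word).lower()
    primary, secondary = [], []
    i = 0
    while i < len(s):
        ch = s[i]
        if s.startswith(CGJ + DIAERESIS, i + 1):
            # Tréma: same primary weight as the bare vowel.
            primary.append(ch)
            secondary.append(1)
            i += 3
        elif ch in "aou" and s.startswith(DIAERESIS, i + 1):
            # Umlaut: assumed to sort like vowel + e.
            primary.extend([ch, "e"])
            secondary.extend([2, 0])
            i += 2
        else:
            primary.append(ch)
            secondary.append(0)
            i += 1
    return ("".join(primary), tuple(secondary))

words = ["M\u00f6ller", "Kapernau\u034f\u0308m", "Molle", "Moeller", "Kapernaum"]
for w in sorted(words, key=sort_key):
    print(w)
```

With these assumptions both spellings of Kapernaum end up adjacent, and
Möller sorts directly after Moeller rather than after Molle.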

Note, the bulk of all possible data in German won't make that distinction,
and won't be used on systems that support the special sorting method.

A./


This archive was generated by hypermail 2.1.5 : Wed Jul 14 2004 - 23:01:49 CDT