From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Wed Jul 14 2004 - 23:00:17 CDT
At 01:52 PM 7/14/2004, Doug Ewell wrote:
>It's not German data (with umlauts) that will be affected by this
>solution, but non-German data (with diaereses) in German bibliographic
>systems. That makes it a much smaller problem.
The use of diaeresis is perfectly valid for words in fields that have a
language ID 'German'.
>The DIN request and the USNB solution didn't address this, because the
>problem to be solved was disambiguating {a, o, u}-with-tréma from {a, o,
>u}-with-umlaut. If there are combinations of (for example)
>a-with-tréma-and-something-else AND ALSO
>a-with-umlaut-and-something-else, then those two will need to be
>disambiguated somehow. But I strongly doubt that the latter case exists
>in German bibliographic data, though of course one never knows.
First off, there have to be corresponding entries in the sorting tables
used for such data, to make that distinction have the correct effect. Since
the sorting tables would not support anything other than <BASE, CGJ,
DIAERESIS> there's no reason to introduce other sequences into the data.
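As a rough illustration of the distinction being discussed, here is a minimal Python sketch that tells the two sequences apart in stored data. It assumes the convention from this thread: a plain <BASE, DIAERESIS> is read as an umlaut, and <BASE, CGJ, DIAERESIS> marks a tréma. The function name classify_marks is hypothetical, and this is not taken from any actual bibliographic system.

```python
import unicodedata

CGJ = "\u034f"        # COMBINING GRAPHEME JOINER
DIAERESIS = "\u0308"  # COMBINING DIAERESIS

def classify_marks(text):
    """Return (base, kind) pairs for each COMBINING DIAERESIS in text.

    Assumed convention (from the thread, not from any library):
      <BASE, DIAERESIS>       -> "umlaut"
      <BASE, CGJ, DIAERESIS>  -> "trema"
    """
    # NFD splits precomposed letters such as U+00E4 into a + U+0308.
    # CGJ has combining class 0, so it stays where it was entered.
    decomposed = unicodedata.normalize("NFD", text)
    found = []
    for i, ch in enumerate(decomposed):
        if ch != DIAERESIS or i == 0:
            continue
        if decomposed[i - 1] == CGJ:
            base = decomposed[i - 2] if i >= 2 else ""
            found.append((base, "trema"))
        else:
            found.append((decomposed[i - 1], "umlaut"))
    return found

print(classify_marks("H\u00e4user"))               # precomposed a-umlaut
print(classify_marks("Kapernau\u034f\u0308m"))     # u + CGJ + diaeresis
```

The point of the sketch is only that the two cases are mechanically distinguishable once somebody has entered the CGJ; data without it cannot be classified this way.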
Secondly, the diaeresis is used to indicate that two vowels are pronounced
separately. I haven't seen a case where the vowels would already be accented.
Finally, phonetic sorting is relevant in this instance not only because the
pronunciations are in fact different, but also because the use of diaeresis
is not mandatory to the same degree as for umlauts. You can find Kapernaum
spelled with and without the diaeresis, but if you spell Hauser with an
umlaut, it's the plural of Haus; without it, it's a name.
Personal names, however, are sometimes spelled with vowel + e (Moeller).
By sorting diaeresis as a secondary difference, related terms do sort
together, and names sort near their variant spellings. The suggested
approach solves the problem at hand for those data where somebody took the
trouble to decide (on input) which was which, so that huge catalogs of
subject keywords or authors come out correctly.
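The effect described above can be sketched with a two-level sort key in Python. This is a simplified illustration under stated assumptions, not the DIN rules or a real collation table: an umlaut on a, o, or u is expanded to vowel + e at the primary level (so it sorts phonetically, next to spellings like Moeller), while a tréma entered as <BASE, CGJ, DIAERESIS> gets no primary weight and only a secondary difference. The name sort_key is hypothetical, and other combining marks are not handled.

```python
import unicodedata

CGJ = "\u034f"        # COMBINING GRAPHEME JOINER
DIAERESIS = "\u0308"  # COMBINING DIAERESIS

def sort_key(word):
    """Two-level key: (primary string, secondary marks)."""
    text = unicodedata.normalize("NFD", word).lower()
    primary = []
    secondary = []
    i = 0
    while i < len(text):
        if text.startswith(CGJ + DIAERESIS, i):
            # Trema: ignored at the primary level, recorded as a
            # secondary difference at the current position.
            secondary.append((len(primary), 1))
            i += 2
        elif text[i] == DIAERESIS and primary and primary[-1] in "aou":
            # Umlaut (assumed): expand to vowel + e at the primary level,
            # with a secondary mark so it still differs from a literal "e".
            primary.append("e")
            secondary.append((len(primary), 2))
            i += 1
        else:
            primary.append(text[i])
            i += 1
    return ("".join(primary), tuple(secondary))

names = ["M\u00fcller", "M\u00f6ller", "Moeller", "Mahler"]
print(sorted(names, key=sort_key))
```

With this key, Kapernaum with and without the tréma shares one primary key and differs only at the secondary level, whereas Hauser and its umlauted form get different primary keys.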
Note, the bulk of all possible data in German won't make that distinction,
and won't be used on systems that support the special sorting method.
A./