Re: Umlaut and Tréma, was: Variation selectors and vowel marks

From: busmanus (
Date: Wed Aug 04 2004 - 14:52:07 CDT


    Marcin 'Qrczak' Kowalczyk wrote:

    > On Fri, 23-07-2004, at 18:01 +0200, Philipp Reichmuth
    > wrote:
    >>However, to return to the original problem, I don't remember ever having
    >>seen data where it would be necessary to distinguish between trema and
    >>diaeresis in the data itself.
    > A similar issue: a Polish encyclopaedia I have from 1985 sorts words
    > with Ó differently depending on whether this is Polish Ó (sorted between
    > O and P, like other Polish letters are after letters without accents)
    > or foreign Ó (folded with O, like other foreign accents are folded).
    > It's typeset in the same way.
    > MÓR [mo:r], city in Hungary
    > MORA
    > MÓRA [mo:ro] Ferenc, Hungarian writer
    > [...]
    > MÓR (a Polish word)
    > [...]
    > MÓŻDŻEK (a Polish word)

    The context is somewhat different in these two cases, though: in the case
    of Umlaut vs. Tréma, the distinction is between two different,
    well-defined functions of the same diacritic, both of which traditional
    German scholarship is aware of (if for no other reason, then because of
    the influence of the rather significant body of Classical Greek
    scholarship that Germany produced), even if the use of one of them is
    foreign to lexical items of native German vocabulary.

    In the case of Ó in Polish, there is the native function (using Ó
    to write a U that is etymologically connected with an O, if I'm not
    mistaken) on one side, and there are all the non-native functions
    (Hungarian Ó denoting a long O; Spanish Ó denoting a stressed O, which
    may be the case with Portuguese Ó as well, though I'm not sure; and then
    the Icelandic and Irish Gaelic Ó, which may have a fourth and a fifth
    function), all grouped together on the other side.

    I'm not aware of native and non-native Ó being sorted any differently
    in Hungarian encyclopedias, but it may happen in one or two of the other
    languages I listed (or in yet others I'm not aware of). I'm inclined to
    think that using e.g. a plain COMBINING ACUTE vs. CGJ + COMBINING ACUTE
    is a dangerous way of approaching problems like Polish vs. non-Polish Ó.
    The relevant point here seems to be the language the word is in
    (I understand Unicode also has standard language markers defined in its
    inventory).
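
    To illustrate the language-based approach, here is a minimal Python
    sketch that reproduces the encyclopaedia order quoted above by keying
    the sort on a language tag carried alongside each headword, rather than
    on any difference in the encoded characters. Everything in it is an
    assumption made for illustration: the names POLISH_ALPHABET,
    fold_accents and sort_key, the "pl"/"hu" tags, the tag given to MORA
    (which the quoted list does not state), and the rule that an unaccented
    form precedes an accented one when they otherwise tie.

```python
import unicodedata

# Polish alphabet order: each accented Polish letter follows its base letter.
POLISH_ALPHABET = "AĄBCĆDEĘFGHIJKLŁMNŃOÓPQRSŚTUVWXYZŹŻ"
RANK = {ch: i for i, ch in enumerate(POLISH_ALPHABET)}


def fold_accents(word):
    """Drop combining marks, so that a foreign Ó is treated as plain O."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def sort_key(entry):
    """Build a sort key from a (headword, language tag) pair.

    Polish words keep Ó as a separate letter between O and P;
    words in any other language have their accents folded away.
    """
    word, lang = entry
    word = unicodedata.normalize("NFC", word.upper())
    letters = word if lang == "pl" else fold_accents(word)
    # Letters outside the Polish alphabet sort after it, by code point.
    primary = tuple(RANK.get(ch, len(RANK) + ord(ch)) for ch in letters)
    # Tie-break (assumed): the unaccented spelling comes first.
    return (primary, letters != word)


# Hypothetical tagged headwords, deliberately out of order.
entries = [
    ("MÓŻDŻEK", "pl"),
    ("MÓR", "pl"),
    ("MÓRA", "hu"),
    ("MORA", "pl"),
    ("MÓR", "hu"),
]

for word, lang in sorted(entries, key=sort_key):
    print(word, lang)
```

    The two MÓR entries are identical as character strings; only the tag
    tells them apart, so nothing invisible has to be stored in the text
    itself.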



