Re: Looking for transcription or transliteration standards latin->arabic

From: busmanus (
Date: Thu Jul 08 2004 - 15:27:05 CDT


    You will need a Unicode font with Central European and IPA
    characters to read my examples.

    Mike Ayers wrote:
    > > Perhaps it is. But then it's partly due to the "lazy" tradition.
    > Are you implying that, had printers throughout the centuries put
    > the effort into faithfully reproducing every obscure symbol from every
    > foreign language, that the modern American would accept words with
    > arbitrary diacritics?

    I do not pretend to know, but "accept" is probably not the best word
    to use in this context; after all, it's not about the spelling of
    English words. And not every tradition needs to be hundreds of
    years old.

    > > I don't think it's a problem with any given diacritical. It's rather
    > > an indistinct horror of diacriticals in general in speakers of a
    > > language without any diacriticals at all, like English. E.g.
    > > Hungarian uses three diacriticals and Hungarian speakers make no
    > > big deal of just ignoring the "meaningless" caron in Czech or
    > > the grave
    > > and the cedilla in Roumanian names.
    > > On the other hand, I must admit, that we also can be quite brutal
    > > to diacriticals in some newspapers or when it comes to a language
    > > like Vietnamese...
    > In other words, you're pretty comfortable with your own
    > diacritics. You make my point for me.

    "Our own" are the acute (to show vowel length), the diaeresis
    (to show timbre, like in German) and the doubleacute (=a "stretched
    diaeresis" actually, to show both timbre and length at the same
    time). The caron and the cedilla are just as foreign to us as e.g.
    the odd "question marks" above Vietnamese vowels, even if they
    may be less unusual. And the newspapers I'm talking about
    may just be classic examples of lazy typography; at least the silly
    spelling mistakes and other inaccuracies they allow themselves point
    in that direction. In books by any serious publisher, it would
    definitely be completely unacceptable to write e.g. Hašek's name
    (a famous Czech satirist) as Hasek.

    Since we have got into this debate, let me give an example where
    distinguishing between diacritics as "familiar" and "unfamiliar" may
    lead to undesirable results. Imagine someone writes an article about
    a person named Törőcsik [tørøːʧik] (we happen to have an actress
    by that surname). Suppose the journalist thinks it reasonable to retain
    the "familiar" diaeresis, because it is found in German and many other
    well-known orthographies. But what should be the fate of the
    doubleacute (which is actually nothing but a special kind of diaeresis,
    as I mentioned above)? As an "unfamiliar" diacritic, it should be
    discarded if the principle is applied mechanically. This would result
    in the form "Törocsik" [tøroʧik]; however, as you may see from the
    phonetic transcription, this is not simply incomplete information
    in such a context, but explicit misinformation. A less cruel
    approach would be to replace the "special diaeresis" with the "normal"
    one and write "Töröcsik" [tørøʧik]. This is undoubtedly the least
    unacceptable of the "diacritic-folded" variants mathematically possible,
    but it is neither a proper English transcription because of the
    diaereses and the unusual value of the consonant cluster "cs", nor
    correct Hungarian because of "denying" the long vowel, so what is it
    after all?
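
    The three "diacritic-folded" variants discussed above can be
    reproduced mechanically. What follows is only a minimal sketch in
    Python, assuming that folding is done by decomposing each letter
    (Unicode NFD) and then dropping or replacing the combining marks.
    The function name fold and its keep/remap parameters are invented
    here for illustration and do not come from any standard or library.

```python
import unicodedata

DIAERESIS = "\u0308"     # COMBINING DIAERESIS
DOUBLE_ACUTE = "\u030B"  # COMBINING DOUBLE ACUTE ACCENT


def fold(text, keep=frozenset(), remap=None):
    """Drop every combining mark not listed in `keep`.

    `remap` (hypothetical parameter) replaces one combining mark with
    another before the `keep` test is applied.
    """
    remap = remap or {}
    out = []
    # NFD splits a precomposed letter into base letter + combining marks.
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            ch = remap.get(ch, ch)
            if ch not in keep:
                continue  # discard the "unfamiliar" mark
        out.append(ch)
    # NFC recomposes whatever base + mark pairs are left.
    return unicodedata.normalize("NFC", "".join(out))


name = "Törőcsik"

# Strip everything: the "lazy" newspaper variant.
print(fold(name))                                   # Torocsik

# Keep the "familiar" diaeresis, mechanically drop the double acute.
print(fold(name, keep={DIAERESIS}))                 # Törocsik

# Replace the double acute with a plain diaeresis instead of dropping it.
print(fold(name, keep={DIAERESIS},
           remap={DOUBLE_ACUTE: DIAERESIS}))        # Töröcsik
```

    Note that none of the three outputs equals the original name; the
    sketch only makes explicit which information each policy throws away.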

    There may not be an easy way to resolve such situations so that
    everybody would be pleased, but at least thinking about them does no
    harm. Sorry for being so long, perhaps someone finds my data



    This archive was generated by hypermail 2.1.5 : Thu Jul 08 2004 - 15:16:52 CDT