Re: transforms and language identifiers (was Re: Dozenal chars in music)

From: Mark Davis (mark.edward.davis@gmail.com)
Date: Wed May 27 2009 - 16:59:20 CDT

    Mark

    On Wed, May 27, 2009 at 00:24, Michael Everson <everson@evertype.com> wrote:

    > Mark,
    >
    > On 27 May 2009, at 02:35, Mark Davis wrote:
    >
    >>
    >> The API does not actually do that. The API actually returns precisely
    >> which one was chosen, so the user has a choice, as I said, of discarding the
    >> transform, or using it. So you can ask for "en_GB-ipa". We don't have one
    >> available currently, so you would get back "en-ipa". According to the CLDR
    >> data, mechanically readable, that is equivalent to *a* en_US-ipa transform.
    >> You can at that point simply reject it, and tell your user that there is
    >> nothing available, if you judge that it is better to fail than to return a
    >> different variant of English than you want.
    >>
    >
    > It's en_US-fonipa, not en_US-ipa

    I don't think you've been reading all the messages in this thread; I've
    explained multiple times that the transform IDs are not BCP-47 language
    tags. They are related, but not the same.
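
    To make the fallback behaviour described above concrete, here is a minimal
    sketch. It is not the actual transform API; the names AVAILABLE,
    fallback_chain, resolve and get_transform_id are hypothetical, and the
    only assumption taken from the thread is that a request for "en_GB-ipa"
    comes back as "en-ipa" when no more specific transform exists.

```python
# Hypothetical registry: only a generic English-to-IPA transform is available.
AVAILABLE = {"en-ipa"}


def fallback_chain(requested):
    """List candidate transform IDs from most to least specific.

    The source part is truncated one subtag at a time, so
    "en_GB-ipa" yields ["en_GB-ipa", "en-ipa"].
    """
    source, _, target = requested.partition("-")
    parts = source.split("_")
    chain = []
    while parts:
        chain.append("_".join(parts) + "-" + target)
        parts.pop()
    return chain


def resolve(requested, available=AVAILABLE):
    """Return the ID that was actually chosen, or None if nothing matches."""
    for candidate in fallback_chain(requested):
        if candidate in available:
            return candidate
    return None


def get_transform_id(requested, accept_fallback):
    """Let the caller decide whether a less specific transform is acceptable."""
    chosen = resolve(requested)
    if chosen is None:
        return None
    if chosen != requested and not accept_fallback:
        # The caller judged it better to fail than to use another variant.
        return None
    return chosen


print(resolve("en_GB-ipa"))                  # en-ipa
print(get_transform_id("en_GB-ipa", False))  # None
```

    The point is only that the caller sees which ID was chosen and can reject
    it, rather than silently receiving a different variant of English.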

    >
    > Of course. But most en-UK speakers accept RP as a reference standard
    >>> pronunciation, although they no longer consider it a normative standard.
    >>> Likewise people accept GA as an American reference standard, not a normative
    >>> standard.
    >>>
    >>
    >> I can't speak to the former, but as to the latter; I don't know that the
    >> average non-GA American would necessarily consider it "the reference
    >> standard".
    >>
    >
    > To the degree that he or she could understand the question, yes, Johnny
    > Carson's Nebraskan "GA" would be considered so.

    By Nebraskans, yes. By Texans, maybe not.

    >
    > I think it's not entirely clear whether UK or US English is viewed as the
    >>> reference standard for English, if you're only interested in numbers. US
    >>> clearly dominates the native-English-speaking world, but probably many of
    >>> the L2 English speakers still think of UK English as a nearer reference
    >>> standard than US, especially in those places where there are many L2
    >>> speakers.
    >>>
    >>
    >> If you have some hard figures on that it would be useful to consider them.
    >>
    >
    > Julian has to provide figures but you can just make assertions? Hm.

    No reason to be snide, especially when you haven't read all the emails. I
    did point to figures.

    >
    >
    > But a GA transcription has less information than an RP transcription, so
    >>> can't be transformed to be right for RP. Similarly, such a transcription
    >>> should include all the /r/s, even those that non-rhotic speakers (e.g. RP)
    >>> don't pronounce, because non-rhotic speakers can remove the /r/s, but rhotic
    >>> speakers can't insert /r/s that aren't there in the transcription.
    >>>
    >>
    >>
    >> I don't know that that is really the case.
    >>
    >
    > What, you think that rhotic speakers can insert /r/s that aren't in the
    > transcription? Wrong. /bɑː/ may mean 'the sound a sheep makes' or 'a place
    > to get a drink'. There is no information there which will tell which word
    > gets an /r/ and which does not. On the other hand if /bɑː/ and /bɑːr/ are
    > written, the non-rhotic speaker can delete the for-him-or-her superfluous
    > /r/. Julian is right.

    If you read it over carefully, including the portions you omitted from the
    message, you'll see that I was objecting to his statement that "you can
    convert from RP to GA pretty well".

    What is probably true is that if you had a specially marked-up IPA that took
    RP, made it rhotic, and added some of the other distinctions that the
    various variants of English make, you could probably map from that marked-up
    form to either RP or GA and get somewhat accurate results. But that is a far
    different (and bigger) project.
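
    The asymmetry under discussion can be shown with a small sketch. It is a
    deliberate simplification, not a real RP or GA converter: the VOWELS set
    and the function drop_nonprevocalic_r are hypothetical, and the rule used
    (delete /r/ unless a vowel follows) ignores stress marks, linking /r/ and
    every other difference between the accents.

```python
# Rough, illustrative vowel inventory; not a complete IPA vowel set.
VOWELS = set("aeiouɑɔəɛɪʊæʌɒɜ")


def drop_nonprevocalic_r(ipa):
    """Derive a non-rhotic form by deleting /r/ not followed by a vowel."""
    out = []
    for i, ch in enumerate(ipa):
        if ch == "r":
            nxt = ipa[i + 1] if i + 1 < len(ipa) else ""
            if nxt not in VOWELS:
                continue  # word-final or pre-consonantal /r/ is dropped
        out.append(ch)
    return "".join(out)


sheep_sound = "bɑː"   # rhotic transcription of "baa"
pub = "bɑːr"          # rhotic transcription of "bar"

# Going from rhotic to non-rhotic is mechanical, but both words collapse.
print(drop_nonprevocalic_r(sheep_sound))  # bɑː
print(drop_nonprevocalic_r(pub))          # bɑː
print(drop_nonprevocalic_r("mɛri"))       # mɛri (the /r/ precedes a vowel)
```

    Since both inputs produce the same output, no function of the non-rhotic
    string alone can tell which word should get its /r/ back; that information
    has to come from somewhere else, such as the marked-up form described
    above.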

    >
    > And you can't reliably transform from RP to GA (or the reverse). See, for
    >> example, http://www.spellingsociety.org/journals/j3/accents.php and
    >> http://www.phon.ucl.ac.uk/home/wells/phoneticsymbolsforenglish.htm
    >>
    >> Wells has, for example,
    >>
    >> ɑː
    >> start, father
    >> ɔː
    >> thought, law, north, war
    >>
    >> You couldn't map those reliably to GA, because some are rhotic and some
    >> are not. And there are some cases that are even clearer, like "privacy".
    >>
    >
    > Well, not so much, because even within GA you have /ˌɛkəˈnɔmɪk/ alongside
    > /ˌikəˈnɔmɪk/. The edge-case does not refute the argument.
    >
    > Anyway what Julian was saying was that rhotic transcriptions offer more
    > information than non-rhotic. Did you know that there are many (many)
    > dialects of European English which are rhotic?

    Yes. See above.

    >
    >
    > In fact, your system already does some of this: it transforms
    >>> When will Merry Mary marry?
    >>> to
    >>> wɛn wɪl mɛri meri mæri?
    >>> although most Americans don't make the three-way distinction.
    >>>
    >>
    >
    > I always did and still do. (Eastern Pennsylvania.) Though my brother has
    > lost the distinction between the first two (I blame the dialects he was
    > exposed to in the Navy.)
    >
    > I have /ʍɛn/ in stressed position too. :-) That's gone throughout Britain
    > apart from Scotland, though it is common enough in much if not most of
    > Ireland.
    >
    > All this kind of stuff has of course been considered ad nauseam in the
    >>> various proposals for "phonetic" orthographies for English.
    >>>
    >>
    > That's why the most sensible arguments for reform are the ones like
    > Webster's (which was not very systematic) or Wijk's (whose "hoote" vs.
    > "foot" makes very good sense indeed).
    >
    > Michael Everson * http://www.evertype.com/
    >
    >
    >
    >


