From: verdy_p (verdy_p@wanadoo.fr)
Date: Wed Aug 04 2010 - 16:10:56 CDT

    "Doug Ewell" wrote:
    > There is no "formal model" in the sense of a standard N-letter subtag
    > for dialects, because the concept of a dialect is too open-ended and
    > unsystematic. The word means different things to different people.
    > What may be a dialect to one person might be a full-blown National
    > Language to another, or just a funny accent to a third.

    The formal model already exists in ISO 639, which has decided to unify all dialectal variants under the same
    language code. Yes, the concept is fuzzy, but as long as ISO 639 does not contain a formal model for how the various
    languages are grouped into families and subfamilies, it will be impossible to use dialectal variant specifiers with
    accurate fallbacks without using subtags for the language variants.

    One known problem is, for example, Norman, which ISO 639 still considers a dialect of French, even though it is just
    ANOTHER Oïl language (the group from which Standard French emerged by merging, modifying and extending several
    dialects).

    But Jersiais is now a language with official status in Jersey, and it is clearly part of the Norman family. And it
    still needs to be distinguished from French. Still, there's no ISO 639 code for Norman (as a family or as the
    residual language in continental Normandy in France), and no code for Jersiais either. And French is considered in
    ISO 639 as an "isolated" language, not as a "macrolanguage". So it allows no further precision.

    If something is added, it can only be a variant for the "dialectal" difference, such as "fr-norman" for the Norman
    family, or "fr-jersiais" for Jersiais, unless Jersiais gets its own ISO 639-3 code as an isolated language (leaving
    the continental Norman still as a dialectal variant of French).
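
    As a rough illustration of the fallback issue above, here is a minimal Python sketch of truncation-style lookup
    in the spirit of RFC 4647 (BCP 47). The names fallback_chain and lookup are made up for this example, "fr-norman"
    and "fr-jersiais" are the hypothetical variant tags from this message (they are not assumed to be registered),
    and "qjr" is only a placeholder taken from the ISO 639 private-use range qaa..qtz.

```python
def fallback_chain(tag):
    """Return the tag followed by its progressively truncated forms."""
    subtags = tag.split("-")
    chain = []
    while subtags:
        chain.append("-".join(subtags))
        subtags.pop()
        # When truncating, also drop a trailing single-character subtag
        # (for example the "x" private-use singleton).
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return chain


def lookup(requested, available, default="und"):
    """Pick the first available tag found along the fallback chain."""
    by_lowercase = {t.lower(): t for t in available}
    for candidate in fallback_chain(requested):
        match = by_lowercase.get(candidate.lower())
        if match is not None:
            return match
    return default


if __name__ == "__main__":
    # A dialect encoded as a variant of French falls back to French.
    print(lookup("fr-jersiais", ["en", "fr"]))
    # A standalone code (placeholder "qjr") has nothing to truncate to,
    # so plain truncation cannot reach French without a family model.
    print(lookup("qjr", ["en", "fr"]))
```

    With plain truncation, the variant form reaches "fr" for free, while a separate code would need some external
    knowledge of language families to fall back to French at all.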

    The "formal definition" of languages is the definition of ISO 639-3 "isolated" languages. Everything below that
    level is dialectal (and ISO 639 has clearly stated that it plans, for much later, a comprehensive encoding of
    dialectal differences, most probably by defining a standard list of "variant" codes, even if these dialects may
    qualify as "languages" for some users).

    It's remarkable that for most linguists, Serbian, Croatian, and Bosnian are only one language, with only dialectal
    differences (in the spoken language, with some grammatical derivations and some minor lexical differences that
    are understood by all Serbo-Croatian speakers) and orthographic differences (mostly based on their default script;
    Serbian still uses both scripts, but it defines a strict transliteration system that helps define a unified
    orthography for both scripts, and these orthographies are simplified in Croatian and Bosnian).
    So yes, the concept of dialect vs. language is fuzzy for linguists and users (and for nationals who prefer to see
    their dialect named after their country as a full language instead of a dialect), but ISO 639 defines a formal
    model by its technical encoding: if there's an authority defending the position of a distinct language and defining
    an official lexicon and orthography, it becomes a "de facto" language for ISO 639.
    Such a split of a language, with its dialectal differences promoted to isolated languages, has occurred and was
    endorsed by ISO 639, even if it was probably not in the interest of these countries to split their common language
    and to reduce its audience and cultural influence in other parts of the world (many of their own citizens won't
    care much about these formal official differences, as long as they understand the language and can read and write
    it in a script that they can decipher without difficulty, if only because they will constantly live near other
    peoples sharing the same language under a different name).
    Serbian is still perceived and encoded as a single language, even though it still uses two scripts, depending on
    the region of use (but it is now rapidly converging on the Latin script). Maybe the linguistic and cultural
    authorities of the four countries concerned (or five, now with Kosovo, whose independence was recently validated by
    an international court?) will decide to reunite their cultural efforts, if they finally all use the same Latin
    script, by adopting a new neutral name (Dolmoslavic, Adriatic, Adrislavic? Or even Yugoslavic?) and increasing
    their mutual cultural exchanges instead of wasting them for old nationalist reasons (this will be even more
    important when they finally ALL join the European Union, with increased exchanges between them).

    This archive was generated by hypermail 2.1.5 : Wed Aug 04 2010 - 16:12:25 CDT