Re: Markup for Language (was: Re: Exemplifying apostrophes)

From: Behnam (
Date: Thu May 29 2008 - 13:07:02 CDT

    On 29-May-08, at 11:32 AM, Phillips, Addison wrote:

    > Language identification can be applied at many levels to a
    > document. It can certainly be applied to a string of characters. It
    > can also be usefully applied to sentences, paragraphs, chapters,
    > sections, entire documents, and even collections of documents. (And
    > a document need not be written--sound recordings, for example,
    > often use language.)
    > There are at least two types of language identification (see [1]).
    > For the kind you mean here, language identification can work at any
    > appropriate level of granularity. This email, for example, is
    > entirely in English. There is no point in marking up every single
    > sentence, line, word, or character with a language tag when the
    > Content-Language header for the whole thing does the job nicely.
    > Certainly a span of text can be in another language and should be
    > appropriately tagged. But over-tagging increases complexity and
    > burns bandwidth/storage to no good effect. Or, as we say in
    > language tagging land, "Tag Content Wisely".
    > Best Regards,
    > Addison
    > [1]

    Yes, Addison, thanks. This is what Ken Whistler initially pointed out.
    Exporting the document to HTML seems to be the best available option.
    I'll work on it.
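
    As a rough sketch of what "tagging wisely" could look like in an HTML
    export: the document language is declared once on the root element,
    and only runs in a different language get their own lang attribute.
    The helper names (span, document) and the sample sentence are
    hypothetical and used purely for illustration; they do not come from
    any tool mentioned in this thread.

```python
from html import escape


def span(text, lang=None):
    """Return escaped text, wrapped in <span lang="..."> only when a language is given."""
    if lang is None:
        return escape(text)
    return f'<span lang="{escape(lang)}">{escape(text)}</span>'


def document(lang, parts):
    """Build a minimal HTML page from (text, language) pairs.

    The document language is set once on <html>; a part is tagged
    individually only when its language differs from the document's.
    """
    body = "".join(
        span(text, None if part_lang == lang else part_lang)
        for text, part_lang in parts
    )
    return f'<html lang="{escape(lang)}"><body><p>{body}</p></body></html>'


# Hypothetical sample: an English sentence containing one Persian word.
page = document(
    "en",
    [("The Persian word ", "en"), ("کتاب", "fa"), (" means book.", "en")],
)
print(page)
```

    Only the Persian run carries its own tag here; the English text
    inherits the language declared on the html element.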


