RE: Summary: xml:lang validity and RFC 1766 refs to outdated code s

From: Mike Brown (
Date: Tue Aug 08 2000 - 11:59:19 EDT

> > XML 1.0 says that xml:lang attributes must match production 33
> In fact, not so. Productions 33-38 have no normative value
> whatsoever, as there is neither a production nor normative
> language connecting them with the rest of XML 1.0.
> [...]
> In recognition of this fact, official erratum E73 (at
> removes these
> productions from XML 1.0 altogether. It also allows for a
> successor to RFC 1766 when and if such a thing exists.

Correct, but RFC 1766 doesn't, in turn, allow for successors to ISO 639 and
ISO 3166, at least not by a strict interpretation of its formal language.
And to date, there still is no successor to RFC 1766.

E73 says in its rationale "The XML processor does not deal with the value of
xml:lang", but it also says, more formally, "The values of the attribute are
language identifiers as defined by [IETF RFC 1766]".

The use of "are" in that statement sounds as definitive as "must" to me. As
an XML document author, or the programmer of an XML document authoring tool,
tell me, do I or do I not use RFC 1766 language tags/identifiers as xml:lang
values? It seems that XML says I must use them, but it would not be a
violation of validity if I didn't use them.
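
To make the "processor does not deal with the value" point concrete, here is a minimal Python sketch. It assumes the standard library's xml.etree.ElementTree, which is a non-validating parser, so it only shows what a well-formedness check does with the attribute; it says nothing about what a validating processor would report. The sample document and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI that the reserved "xml" prefix is bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical document whose xml:lang value is clearly not an
# RFC 1766 language tag (it contains spaces and punctuation).
doc = '<p xml:lang="not a language tag!">hello</p>'

# Parsing succeeds: the parser treats xml:lang as an ordinary attribute
# and does not inspect its value.
elem = ET.fromstring(doc)
lang = elem.get(f"{{{XML_NS}}}lang")
print(lang)
```

The parser hands the value back unchanged, so any enforcement of RFC 1766 would have to come from the authoring tool or the application, not from this layer.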

I also don't see how one could read RFC 1766 in such a way as to ignore its
prescription of a finite range of possible values for what it calls a
language tag:

      Language-Tag = Primary-tag *( "-" Subtag )
      Primary-tag = 1*8ALPHA
      Subtag = 1*8ALPHA

    In the primary language tag:

     - All 2-letter tags are interpreted according to ISO
          standard 639, "Code for the representation of names
          of languages" [ISO 639].

     [...mention of "i-" and "x-"...]

     - Other values cannot be assigned except by updating
          this standard.

The removal of productions 33-38 from XML really just seems to be
intended to allow RFC 1766 and its successors to determine the proper
construction of a language tag, which makes more sense than trying to
reiterate the RFC's technical contents in XML's specification. It doesn't
necessarily follow that xml:lang values can avoid conforming to RFC 1766.
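
As a rough illustration of what an authoring tool could check, the sketch below encodes the Language-Tag grammar quoted above and the primary-tag rules that follow it. The function name and the category labels are my own, not anything from XML 1.0 or the RFC, and the sketch deliberately does not verify that a 2-letter primary tag is actually listed in ISO 639; a real tool would need that code list.

```python
import re

# Grammar quoted from RFC 1766:
#   Language-Tag = Primary-tag *( "-" Subtag )
#   Primary-tag  = 1*8ALPHA
#   Subtag       = 1*8ALPHA
LANGUAGE_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def classify_language_tag(tag: str) -> str:
    """Classify a candidate xml:lang value by its primary tag.

    The returned labels are illustrative only.
    """
    if LANGUAGE_TAG.fullmatch(tag) is None:
        return "malformed"          # does not match the grammar at all
    primary = tag.split("-", 1)[0].lower()
    if len(primary) == 2:
        return "iso-639"            # to be interpreted per ISO 639 (not looked up here)
    if primary == "i":
        return "iana-registered"    # the "i-" case mentioned in the RFC
    if primary == "x":
        return "private-use"        # the "x-" case mentioned in the RFC
    return "unassigned"             # well-formed, but not assignable without updating the RFC


for candidate in ["en-US", "x-klingon", "i-navajo", "eng", "en_US"]:
    print(candidate, "->", classify_language_tag(candidate))
```

Note that a value such as "eng" passes the grammar yet falls into the "unassigned" bucket, which is exactly the gap between syntactic and strict conformance discussed here.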

[We're on the same side, here. I'm just playing devil's advocate, because
after I heard about this issue and reviewed the specs myself, I found that
there were indeed points of contention.]


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:06 EDT