On Mon, 7 Aug 2000, Mike Brown wrote:
> It has been argued that strict interpretations of RFC 1766 would have the
> effect of requiring values for XML 1.0's xml:lang attribute and HTML 4.01's
> lang attribute to be created from outdated language and country code lists.
[snip]
> XML 1.0 says that xml:lang attributes must match production 33 for
> well-formedness -- on that all seem to agree.
In fact, not so. Productions 33-38 have no normative value whatsoever,
as there is neither a production nor normative language connecting them
with the rest of XML 1.0. The following document is both well-formed
and valid:
<!DOCTYPE root [
<!ELEMENT root EMPTY>
<!ATTLIST root
xml:lang CDATA "">
]>
<root xml:lang="foo%bar"/>
even though "foo%bar" is not a valid language tag.
In recognition of this fact, official erratum E73 (at
http://www.w3.org/XML/xml-19980210-errata#E73) removes these productions
from XML 1.0 altogether. It also allows for a successor to RFC 1766
when and if such a thing exists.
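
For anyone who wants to try the well-formedness half of that claim, here is a minimal sketch using Python's standard xml.etree.ElementTree. It assumes the root element is written as an empty-element tag, and the helper name read_xml_lang is made up for illustration. ElementTree is a non-validating parser, so this shows only that the document parses and that the attribute value comes through untouched; checking validity would need a validating parser, which is not shown here.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# The example document, with the EMPTY root written as an empty-element tag.
DOC = """<!DOCTYPE root [
<!ELEMENT root EMPTY>
<!ATTLIST root
  xml:lang CDATA "">
]>
<root xml:lang="foo%bar"/>"""

# The "xml" prefix is bound to this namespace name by definition.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def read_xml_lang(text: str) -> Optional[str]:
    """Parse text and return the root element's xml:lang value.

    ET.fromstring raises ET.ParseError if the text is not well-formed.
    This helper is hypothetical and does no validation against the DTD.
    """
    root = ET.fromstring(text)
    return root.get("{%s}lang" % XML_NS)


if __name__ == "__main__":
    print(read_xml_lang(DOC))
```

The parser raises no error on the "%" in the attribute value, because nothing in the well-formedness rules ties xml:lang to the language tag syntax.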
> There still remains the unclear issue of whether xml:lang validity really
> should correlate to strict RFC 1766 conformance, down to the selection of
> language codes from ISO 639-1.
It does not. There is no validity constraint prescribing it.
> Regardless, in either case it does not seem unreasonable, especially in
> light of Harald's clarification, to expect that if a validating XML parser
> checks the 2-letter language code portion of an xml:lang value against an
> ISO 639 list, then it will use the most current list available to it.
A validating parser may do so, but it has no warrant for reporting a
validity error if the language code is not on some list.
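
To make the distinction concrete, here is a rough sketch of a checker that treats both the syntax of productions 33-38 and list membership as advisory notes rather than validity errors. The function name lang_advisories and the four-entry KNOWN_CODES set are hypothetical; a real tool would load whatever ISO 639 list it has available.

```python
import re

# Syntax of XML 1.0 productions 33-38 (LanguageID) as one regular expression:
# a two-letter code, or an "i-"/"x-" code, followed by any number of subcodes.
LANGUAGE_ID = re.compile(
    r"(?:[A-Za-z]{2}|[iI]-[A-Za-z]+|[xX]-[A-Za-z]+)(?:-[A-Za-z]+)*"
)

# Hypothetical, deliberately tiny stand-in for an ISO 639 two-letter code list.
KNOWN_CODES = frozenset({"en", "fr", "de", "ja"})


def lang_advisories(value, known_codes=KNOWN_CODES):
    """Return a list of advisory notes (never validity errors) for an xml:lang value."""
    notes = []
    if LANGUAGE_ID.fullmatch(value) is None:
        notes.append("%r does not match the LanguageID syntax" % value)
        return notes
    primary = value.split("-", 1)[0].lower()
    # "i" and "x" introduce IANA and user-defined codes, not ISO 639 codes.
    if primary not in ("i", "x") and primary not in known_codes:
        notes.append("language code %r is not on the list in use" % primary)
    return notes


if __name__ == "__main__":
    for sample in ("en-US", "i-klingon", "zz", "foo%bar"):
        print(sample, lang_advisories(sample))
```

Returning notes instead of raising keeps the check separate from validation proper, so a document is never rejected merely because its language code is missing from the list at hand.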
-- 
John Cowan cowan@ccil.org
C'est la` pourtant que se livre le sens du dire, de ce que, s'y conjuguant
le nyania qui bruit des sexes en compagnie, il supplee a ce qu'entre eux,
de rapport nyait pas. -- Jacques Lacan, "L'Etourdit"