Re: Plane 14 language tags

From: Kenneth Whistler
Date: Wed Jun 28 2000 - 15:10:05 EDT

Doug Ewell asked:

> 2. (Ken and Glenn) Can you explain in a little more detail the rationale
> for lowercasing the entire language tag? It seems that if RFC 1766
> is the model to be followed, then the RFC 1766 casing convention
> (lowercase for language, uppercase for country) might be preferred.

John Cowan's non-authoritative response was fine by me -- and was
better expressed than I would probably have managed. ;-)

> I guess I don't see how lowercasing the entire tag simplifies or
> speeds up anything, since the hyphen which separates language from
> country is outside the range of lowercase letters anyway and
> processes that want to ignore LT's must ignore the entire range from
> U+E0000 through U+E007F.

It is not a matter of range-checking. For ignoring tags, you would always
check the entire range. Rather, it is just a suggestion that since
case is not significant in the language tags, it is slightly preferable
to do the early "normalization" (i.e. case folding to lowercase, in
this instance), rather than emitting arbitrarily mixed case tags
and distributing the case-folding burden to all the interpreters of
the tags.
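
To make the point concrete, here is a rough sketch (an illustration only, not
anything taken from the proposal text). It assumes the tag characters mirror
ASCII at an offset of U+E0000, with U+E0001 as the LANGUAGE TAG introducer.
The helper names to_tag, fold_tag_case and strip_tags are hypothetical. Note
that the ignore check tests the whole range, while folding touches only the
tag capital letters:

```python
TAG_BASE = 0xE0000        # assumed offset: tag character = ASCII value + 0xE0000
LANGUAGE_TAG = 0xE0001    # LANGUAGE TAG introducer
TAG_FIRST, TAG_LAST = 0xE0000, 0xE007F
TAG_UPPER_A, TAG_UPPER_Z = 0xE0041, 0xE005A   # tag counterparts of 'A'..'Z'


def to_tag(ascii_tag: str) -> str:
    """Hypothetical helper: encode an ASCII tag such as 'en-US' as tag characters."""
    if not ascii_tag.isascii():
        raise ValueError("language tags are expected to be ASCII")
    return chr(LANGUAGE_TAG) + "".join(chr(TAG_BASE + ord(c)) for c in ascii_tag)


def fold_tag_case(text: str) -> str:
    """Lowercase tag capital letters once, at emission time; leave all else alone."""
    out = []
    for ch in text:
        cp = ord(ch)
        if TAG_UPPER_A <= cp <= TAG_UPPER_Z:
            cp += 0x20    # same distance as 'A' -> 'a' in ASCII
        out.append(chr(cp))
    return "".join(out)


def strip_tags(text: str) -> str:
    """A process that ignores tags drops the entire range, whatever the case."""
    return "".join(ch for ch in text if not TAG_FIRST <= ord(ch) <= TAG_LAST)
```

With folding done by the emitter, an interpreter can compare tags by simple
equality, e.g. fold_tag_case(to_tag("en-US")) == to_tag("en-us"), instead of
every interpreter carrying its own case-insensitive comparison.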

--Ken Whistler

