"Addison Phillips [GSC]" wrote:

> They have that odd UTF-5 proposal to be compatible with existing software.


> Supposedly IETF has a working group, but I know nothing about it. I do know
> that there was a recent expansion of legal domain names to 63 bytes (which
> is almost 3x the old limit... or about what one would expect to accommodate
> the BMP in UTF-8........)

Domain labels have always been limited to 63 bytes, a limit imposed by the
length encoding: each label is preceded by a length octet whose top two bits
must be zero, which leaves 6 bits (2^6 - 1 = 63). And domain names can include
any byte whatever,
according to RFC 1035 section 3.1 (the BNF in section 2.3.1 is merely
informative, despite the inappropriate presence of "must" in the last paragraph).
The only restriction is that ASCII letters be compared case-insensitively.
In practice, 0x2E (".") is also reserved, since it separates labels in the
textual form.
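
To make the length encoding concrete, here is a rough Python sketch of the
RFC 1035 wire format for a name. The function name encode_name is my own
invention for illustration, not anything from the RFC; the 63-octet label
limit and the 255-octet overall limit are the ones RFC 1035 states.

```python
def encode_name(labels):
    """Encode a sequence of byte-string labels in RFC 1035 wire format.

    Hypothetical helper for illustration only. Each label is emitted as a
    one-octet length followed by the raw label bytes; a zero-length root
    label terminates the name.
    """
    out = bytearray()
    for label in labels:
        # The top two bits of the length octet must be zero (the value 11
        # in those bits marks a compression pointer), so only 6 bits are
        # available for the length: at most 63.
        if not 1 <= len(label) <= 63:
            raise ValueError("label must be 1..63 bytes long")
        out.append(len(label))
        out += label  # any byte values are accepted at this level
    out.append(0)  # root label
    if len(out) > 255:
        raise ValueError("encoded name exceeds 255 octets")
    return bytes(out)
```

Note that nothing here looks at the content of a label: UTF-8 bytes pass
through as readily as ASCII, e.g. encode_name(["é".encode("utf-8"), b"com"]).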

The effective restrictions are imposed by RFC 1123 (part of STD 3) and its
normative reference to RFC 952, which restricts host names to ASCII letters,
digits, and "-", where the latter character cannot appear at the beginning or
the end.
IMHO this restriction is obsolete and should be superseded.
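
For comparison, the host name restriction can be sketched as a check like the
following. This is only my reading of the rule: the name is_ldh_hostname is
made up for illustration, and I am assuming the "no leading or trailing
hyphen" rule is applied to each label, that a label may start with a digit,
and that a trailing dot is not accepted.

```python
import re

# One label: ASCII letters, digits and "-", 1..63 bytes, with no "-" at
# either end. (Assumption: the rule is applied per label.)
_LDH_LABEL = re.compile(rb"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ldh_hostname(name: bytes) -> bool:
    """Return True if every dot-separated label is letters/digits/hyphen.

    Hypothetical helper for illustration; empty labels (including the one
    produced by a trailing dot) are rejected.
    """
    return all(_LDH_LABEL.fullmatch(label) for label in name.split(b"."))
```

So is_ldh_hostname(b"example.com") is accepted, while a UTF-8 name such as
"bücher.example".encode("utf-8") is rejected, even though the DNS wire format
itself would carry it without complaint.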


