Re: Comments on <draft-ietf-acap-mlsf-00.txt>

From: Martin J. Duerst
Date: Wed Jun 11 1997 - 09:04:47 EDT

On Tue, 10 Jun 1997, Chris Newman wrote:

> While I can envision a number of good reasons for fixed-width process
> code, I'm skeptical that UTF-16 counts as a fixed-width process code and I
> see no evidence that UCS-4 is used by the industry.

UCS-4, or otherwise four-byte character processing code, is used by
Silicon Graphics.
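
To make the fixed-width question concrete, here is a small sketch that counts how many code units each character needs in UTF-16 and in a four-byte encoding. It assumes Python's "utf-32-be" codec is an acceptable stand-in for UCS-4, and the helper name code_units is purely illustrative.

```python
def code_units(text, encoding, unit_size):
    """Return, per character, how many code units it occupies in `encoding`."""
    return [len(ch.encode(encoding)) // unit_size for ch in text]

# A Latin letter, a CJK ideograph, and a character outside the BMP.
sample = "A\u4e2d\U00010400"

# UTF-16 uses 2-byte units; the last character needs a surrogate pair.
utf16_units = code_units(sample, "utf-16-be", 2)
# The four-byte encoding (assumed equivalent to UCS-4 here) uses 4-byte units.
ucs4_units = code_units(sample, "utf-32-be", 4)

print(utf16_units)  # [1, 1, 2]
print(ucs4_units)   # [1, 1, 1]
```

Only the four-byte form gives one code unit per character for all three sample characters.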

> Language alternatives have very different semantics from multi-valued
> attributes. The issues are orthogonal.

Can you explain that some more? There are many Chinese, in particular
in Hong Kong, who have adopted English-style first names. Some
consider them nicknames (which I guess would become a different attribute
or a multi-valued attribute), some consider them part of their full name,
and in some sense (based on their usage when talking) they are language
alternatives. How do you make these things orthogonal?

> > Metadata would make the best place for language information, wouldn't it?
> It would be fine if every attribute contains one and only one language.
> I'm not sure it's correct to presume that both alternative language values
> and mixed-language values will never ever be needed.

I'm not sure either, not at all. Also, I'm not sure whether attributes
will always stay unstructured in other respects. For all those many
designers of application profiles (datasets, attribute collections,...),
the current solution, in my eyes, seems to give the message: It's
okay to have long and complicated attribute values, it's nice if
they have structure, use some notation to denote alternatives,...
I don't know if this is what you want.

> Maybe. On the other hand, mixed language error strings are very likely
> to occur. What about errors where part of the message comes from a
> plug-in or module which doesn't support the client's preferred language?

Are these error strings sent from the ACAP server, or stored as data?
In the former case, I don't see much of a need for plugin modules (maybe
with the exception of comparison/sorting/searching functions) in
an ACAP server implementation. On the client side, of course, this is
much different, but that's not ACAP's concern; it is only between
the ACAP client and the user.

> > What remains is the language of the Alert and Warning messages.
> > For this, the correct solution is language negotiation, i.e.
> > the client telling the server about the languages preferred by
> > the user, and the server telling the client about the language
> > it will use. Alternates in this context are not a solution,
> > because they don't scale.
> What about mixed language error messages, and alert messages which aren't
> available in the client's preferred language? I agree that regardless of
> the solution chosen, the client will need to express a preferred language
> for error text.

I was thinking about writing an I-D for defining such a thing
(together with the response of the server), written so that it
could be easily used/adapted in various IETF protocols that need
it. Any comments/interest/showstoppers/coauthors?
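
As a rough sketch of the negotiation described in the quoted text (the client states its preferred languages, the server answers with the language it will use), something like the following could serve as a starting point. The function name negotiate_language, the default of "en", and the fallback from a tag such as "zh-HK" to its prefix "zh" are all assumptions made for illustration; none of them comes from ACAP or from any existing draft.

```python
def negotiate_language(preferred, available, default="en"):
    """Pick the language the server will use for its messages.

    preferred: language tags in the client's order of preference.
    available: language tags the server has messages for.
    default:   hypothetical fallback when nothing matches.
    """
    # Language tags are compared case-insensitively.
    by_lower = {tag.lower(): tag for tag in available}
    for tag in preferred:
        wanted = tag.lower()
        # Try the full tag first, then progressively shorter prefixes
        # (for example "zh-hk", then "zh").
        while wanted:
            if wanted in by_lower:
                return by_lower[wanted]
            wanted = wanted.rpartition("-")[0]
    return default

# The server has no Chinese messages, so the client's second choice wins.
chosen = negotiate_language(["zh-HK", "ja", "en"], ["en", "ja", "de"])
# Nothing the client asked for is available, so the default is used.
fallback = negotiate_language(["fr"], ["en", "de"])
```

The server would report the returned tag back to the client, so the client knows which language the Alert and Warning text is in.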

> > As I have said, I'm in no way against language tagging.
> > But it should be done by considering the structure and
> > the needs of the protocol.
> The problem is that *every* human-readable string should be labelled with
> an RFC 1766 natural language according to the IAB charset workshop
> recommendations.

Others have cited enough text from RFC 2130 to show that this is
not true in this generality, even if we all agree that language
information is very useful and desirable to have around.

And as an aside: There are things for which it is rather
easy to come to an agreement that they don't need or can't
use language tags, even if they are very much intended to
be human readable. Interestingly, these are things that
are not only handled by computers, but also have to be
dealt with on paper.

Both "charset" character encoding information and explicit
language information are very clumsy to handle on paper.
Because of the first, we have both agreed to use UTF-8
as a turning point for internationalized URLs.
Because of the second, we might both be able to agree
that we have to do without explicit language information
in URLs, even if we likewise agree that it would be nice
to have e.g. for text-to-speech conversion and such.

> It seems technically illogical to have to add complexity
> to every single protocol to carry the tags out of band from the human
> readable strings.

I fully agree. But language is an additional dimension in many
contexts. HTTP without language negotiation, just relying on
language tags of whatever form, would be useless. The same
applies in many other contexts. Protocol developers have to
seriously consider the relation between language and other
things, and find the best solution. For ACAP, there seems
to have been a lot of discussion (haven't had time yet to
look at the archives), and whether and why I as an individual
agree with the results of this discussion is rather irrelevant.

But one of the dangers of MLSF, or similar tagging formats,
is that some other group thinks "we need language information,
because of RFC 2130, why not just take the tags from ACAP, and
we are done." The consequence of this may be that that protocol
doesn't really discuss the needs for language tags in the
corresponding context, and the interaction with other protocol

> Why not just create a format for "human readable strings
> which meets the IAB charset workshop recommendations" and solve the
> problem for all protocols? I think it's quite clear that solving this at
> the protocol level is the wrong level from an architectural standpoint.

For many things, it isn't the wrong level. And if it is the wrong
level, we may want designers to understand it's the wrong level, and
if it's the right level, we want designers to become aware that
it's the right level.

Regards, Martin.
