>>Lots of features would be nice, including the ability to mark a
>>block of text as being of a given language. Again, my question is:
>>do we really need all these levels of language marking? Why can't we
>>keep this simple, especially since you don't lose anything by keeping
>>language marking at the word/character level?
>Because semantically it makes more sense to say "here is a paragraph
>whose language is Japanese" than to say "Here is a paragraph
>containing a span containing Japanese". HTML is a bit hackish, so I
>guess this argument doesn't carry much weight... ;-)
I have discussed this with Rob in Hong Kong, with the same arguments.
In the meantime, I have found *additional* arguments for having
the LANG attribute on (almost) all elements:
If users know about the lang attribute, we are almost sure that
some of them will write
<H3 LANG=ja>text text text</H3>
instead of
<H3><SPAN LANG=ja>text text text</SPAN></H3>
not because they care about semantic integrity and structural
markup or they adhere to the theory that language is a property
of higher level elements and not of characters or whatever,
but just because
a) It's shorter and more convenient
b) It looks natural, with no real reason to disallow it
c) It's difficult to remember where and why LANG is
allowed or not.
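
To make the comparison concrete, here is a minimal sketch in Python
that reports which language each run of text ends up with under the
two spellings above. It assumes the rule that a text run takes the
language of its nearest enclosing element carrying LANG; the names
LangTracker and text_languages are made up for this illustration and
are not part of any specification or browser.

```python
from html.parser import HTMLParser


class LangTracker(HTMLParser):
    """Record each text run with the language of its nearest enclosing
    element that carries a LANG attribute (assumed inheritance rule)."""

    def __init__(self):
        super().__init__()
        self.stack = []  # (tag, effective language) for each open element
        self.runs = []   # (text, effective language) pairs

    def handle_starttag(self, tag, attrs):
        inherited = self.stack[-1][1] if self.stack else None
        # HTMLParser lowercases attribute names, so LANG arrives as "lang".
        lang = dict(attrs).get("lang", inherited)
        self.stack.append((tag, lang))

    def handle_endtag(self, tag):
        # Pop back to the matching open element, tolerating omitted end tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            lang = self.stack[-1][1] if self.stack else None
            self.runs.append((data, lang))


def text_languages(markup):
    """Return (text, language) pairs for every non-blank text run."""
    parser = LangTracker()
    parser.feed(markup)
    parser.close()
    return parser.runs


short_form = text_languages("<H3 LANG=ja>text text text</H3>")
long_form = text_languages("<H3><SPAN LANG=ja>text text text</SPAN></H3>")
print(short_form == long_form)  # both forms label the text as "ja"
```

Under that assumption both spellings give the same result for the
text itself; the only difference is which element carries the
attribute.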
So quite a few users will try this, and probably quite a few browser
makers will support it because they want to be nice to the user.
And in this case, as opposed to other cases where being nice
to the user meant non-SGML syntax and the like, there is
no reason that we cannot just be nice to the user from the
beginning. It would not be a good sign if we had to change
back to adding LANG on all elements later to "document actual
>>I'm assuming that i18n draft has to be included into the next HTML
>>spec, whatever number it may take on. Currently <LANG> is in the
>>HTML 3.0 draft that I have. I'm fine with dropping <LANG> and adding
>><SPAN>, but only if these changes are also reflected in the "non-i18n"
>>part of the standard.
>You should not rely on HTML 3.0. It pioneered many things, but should
>not be looked upon as definitive.
HTML 3.0 was an excellent draft with many ideas, but it became obvious
very quickly that work had to proceed in incremental steps. The HTML
3.0 internet draft therefore disappeared without update after six months.
People involved in HTML development knew this, but book authors in
particular did not realize it. Fortunately, W3C made the bold move of
numbering its newest HTML specification 3.2. Under earlier ideas about
numbering schemes, something like 2.25 would have been more
appropriate, but 3.2 makes it clear that 3.0 no longer exists.
As for the <LANG> vs. <SPAN> issue, the documents that need any
character-span markup, in particular the stylesheet-related
documents, all use <SPAN>, and I think <SPAN> is known well enough
in W3C to ensure that it gets used in other documents in the future.
Certainly, everybody agrees that a <LANG> element just for language
markup is not a good thing, that the generic container <SPAN> is
>>How do we reach closure on these points?
>VERY good question, especially given the W3C assertion of control of
I have participated in the internationalization workshop and in the
developer's day session on i18n, and have had many chances to speak
with various people at the recent WWW conference in Paris
(organized by W3C).
I can only give my own impressions here and cannot speak for anybody
else, but my impression was that the relevant people at W3C are very
much aware of the work of html-wg in the area of HTML i18n, and that
they will try to integrate it as soon as possible.
----
Dr.sc.  Martin J. Du"rst              ' , . p y f g c R l / =
Institut fu"r Informatik               a o e U i D h T n S -
der Universita"t Zu"rich                ; q j k x b m w v z
Winterthurerstrasse  190               (the Dvorak keyboard)
CH-8057   Zu"rich-Irchel   Tel: +41 1 257 43 16
 S w i t z e r l a n d     Fax: +41 1 363 00 35   Email: firstname.lastname@example.org
テュールスト・マーティン・ヤコブ（チューリッヒ大学情報科学科）
----