Re: HTML Multilanguage Support

From: Otto Stolz (Otto.Stolz@uni-konstanz.de)
Date: Thu Jul 01 1999 - 14:15:14 EDT


On 1999-06-29 at 12:58 h PMT, sjohnson@mstc.state.ms.us wrote:
> How would one apply the unicode standards in a HTML.

Cf. <http://www.w3.org/TR/REC-html40/charset.html>, in particular
<http://www.w3.org/TR/REC-html40/charset.html#h-5.2.2>: according
to the HTML 4.0 definition, you may choose the most convenient code
page for your HTML page (which is perceived as a transport vehicle
only) and use character references, such as "&euro;", "&#8364;", or
"&#x20AC;", for those characters that are not in the code page chosen
for the transfer. In practice, however, this works well only when you
choose UTF-8 as your transfer encoding; then, of course, you won't need
to resort to numeric character references. Example of a UTF-8
based page: <http://www.reuters.com/unicode/iuc10/x-utf8.html>.
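
To illustrate the two options described above, here is a minimal Python
sketch. Python and the sample string are assumptions chosen purely for
illustration; they are not part of the HTML 4.0 specification. The sketch
encodes the same text once in ISO-8859-1, where the euro sign has to fall
back to a numeric character reference, and once in UTF-8, where it is
stored directly. It also checks that the three reference forms denote the
same character.

```python
import html

# Hypothetical sample text containing a character (U+20AC) that is
# not part of ISO-8859-1.
text = "Price: 10 \u20ac"

# ISO-8859-1 cannot represent the euro sign, so the "xmlcharrefreplace"
# error handler writes it as a decimal numeric character reference.
latin1_bytes = text.encode("iso-8859-1", errors="xmlcharrefreplace")

# UTF-8 covers all of Unicode, so no character reference is needed.
utf8_bytes = text.encode("utf-8")

# The named entity, the decimal reference, and the hexadecimal reference
# all resolve to the same character.
forms = ["&euro;", "&#8364;", "&#x20AC;"]
decoded = [html.unescape(form) for form in forms]

print(latin1_bytes)
print(utf8_bytes)
print(decoded)
```

Note that the hexadecimal form needs both the "#" and the trailing ";",
just like the decimal form.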

This scheme is defined only for HTML 4.0, so you will need to mark your
document as an HTML 4.0 document, cf.
<http://www.w3.org/TR/REC-html40/struct/global.html#h-7.2>.
(The example page cited above does not, however, include this
mandatory declaration.)

I also recommend tagging the various parts of your HTML page with their
respective languages,
cf. <http://www.w3.org/TR/REC-html40/struct/dirlang.html>, in particular
<http://www.w3.org/TR/REC-html40/struct/dirlang.html#h-8.1.1> combined
with <http://sunsite.auc.dk/RFC/rfc/rfc1766.html>,
<http://userpage.chemie.fu-berlin.de/diverse/doc/ISO_639.html> and
<http://userpage.chemie.fu-berlin.de/diverse/doc/ISO_3166.html>.
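
As a rough sketch of how the document type declaration, the UTF-8
encoding, and the language tags fit together, the following Python
snippet assembles a small HTML 4.0 page. The function build_page, its
parameters, and the sample paragraphs are hypothetical names and data
invented for this illustration; only the DOCTYPE string, the META
element, and the lang attribute come from the HTML 4.0 specification.

```python
import html

# Document type declaration for HTML 4.0 Strict, as given in the spec.
DOCTYPE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" '
    '"http://www.w3.org/TR/REC-html40/strict.dtd">'
)


def build_page(title, paragraphs, default_lang="en", charset="UTF-8"):
    """Return an HTML 4.0 page as bytes in the given charset.

    paragraphs is a list of (language_tag, text) pairs; this input
    shape is an assumption made for the example.
    """
    # Each paragraph carries its own lang attribute (RFC 1766 tag).
    body = "\n".join(
        f'<P lang="{html.escape(lang)}">{html.escape(text)}</P>'
        for lang, text in paragraphs
    )
    page = "\n".join([
        DOCTYPE,
        f'<HTML lang="{html.escape(default_lang)}">',
        "<HEAD>",
        f'<META http-equiv="Content-Type" '
        f'content="text/html; charset={charset}">',
        f"<TITLE>{html.escape(title)}</TITLE>",
        "</HEAD>",
        "<BODY>",
        body,
        "</BODY>",
        "</HTML>",
    ])
    # Encode with the same charset that the META element announces.
    return page.encode(charset)


page = build_page(
    "Language tagging example",
    [("en-US", "Best wishes"), ("de", "Viele Grüße"), ("fr", "Cordialement")],
)
print(page.decode("utf-8"))
```

The language tags here combine an ISO 639 language code with an optional
ISO 3166 country code, e.g. "en-US"; a bare language code such as "de"
is also valid.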

For more info on HTML i18n, cf. <http://www.w3.org/International/>.

For more HTML recommendations, hints, and caveats,
cf. <http://www.w3.org/TR/REC-html40/> and <http://validator.w3.org/>;
<http://www.w3.org/MarkUp/#guidelines>,
<http://www.w3.org/WAI/GL/#Current_Draft> and <http://www.cast.org/bobby/>.

Best wishes,
   Otto Stolz


