On 1999-10-22 at 1:36, Denice Szafran Liscomb <email@example.com> wrote:
> how to code [characters from the Latin Extended A range] on an HTML or XML
> page to make the characters appear properly.
I can only give advice for HTML. I sent most of this to the Unicode
List back in July.
> Where do I find this information?
Cf. <http://www.w3.org/TR/REC-html40/charset.html>. According
to the HTML 4.0 definition, you may choose the most convenient code
page (which is perceived as a transport vehicle only) for your HTML
page and use entities, such as "&euro;", "&#8364;", or "&#x20AC;",
for those characters that are not in the code page chosen for the
transfer. In practice, however, this works well *only* when you
choose UTF-8 as your transfer encoding; then, of course, you won't need
to resort to numerical character references (but you are free to use
them if they suit you). Examples of UTF-8 based pages:
and the attached file.
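
As a rough illustration of the two approaches described above, here is a small sketch in Python (which this message does not otherwise use). The sample string and the variable names are made up for illustration, and `html.unescape` merely stands in for what an HTML 4.0 user agent would do when it resolves character references.

```python
import html

# Sample text (hypothetical): Latin Extended-A letters plus the euro sign.
text = "Łódź costs 5 €"

# Approach 1: transfer as Latin-1. Every character that is not in the
# code page becomes a decimal numeric character reference.
latin1_bytes = text.encode("iso-8859-1", errors="xmlcharrefreplace")
print(latin1_bytes)

# Approach 2: transfer as UTF-8. No character references are needed.
utf8_bytes = text.encode("utf-8")

# Resolving the references gives back the original characters.
decoded = html.unescape(latin1_bytes.decode("iso-8859-1"))

# The named, decimal, and hexadecimal forms all denote U+20AC.
euro_forms = [html.unescape(s) for s in ("&euro;", "&#8364;", "&#x20AC;")]
```

Note that the "ó" survives as a plain Latin-1 byte in the first approach, while "Ł", "ź", and "€" are replaced by references.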
This scheme is defined only for HTML 4.0, so you will need to mark your
document as an HTML 4.0 document, cf.
(The example page from Reuters does, however, not comprise this
You cannot legally include Latin-2 characters in pre-4.0 HTML, as HTML 3.2
mandates Latin-1. In HTML 4, you must specify any encoding other than
Latin-1; so your Latin-2 or UTF-8 encoded HTML pages must either be
sent with an appropriate HTTP header field, or they must contain a META
element, as discussed above.
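
The two ways of declaring the encoding might be sketched as follows, again in Python. `content_type_header` and `build_page` are hypothetical helper names, not part of any library; the DOCTYPE is the HTML 4.0 Strict declaration, and the title and body are assumed to be markup-safe already.

```python
HTML40_DOCTYPE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN" '
    '"http://www.w3.org/TR/REC-html40/strict.dtd">'
)


def content_type_header(charset: str) -> str:
    """Option 1: the HTTP header field that announces the encoding."""
    return f"Content-Type: text/html; charset={charset}"


def build_page(title: str, body: str, charset: str = "UTF-8") -> bytes:
    """Option 2: an HTML 4.0 page that declares its own encoding."""
    markup = "\n".join([
        HTML40_DOCTYPE,
        "<HTML>",
        "<HEAD>",
        f'<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset={charset}">',
        f"<TITLE>{title}</TITLE>",
        "</HEAD>",
        "<BODY>",
        f"<P>{body}</P>",
        "</BODY>",
        "</HTML>",
    ])
    # The bytes must actually be in the encoding that the page declares.
    return markup.encode(charset)


# Hypothetical usage: a Latin-2 page and the matching header field.
page = build_page("Test", "Łódź", charset="ISO-8859-2")
header = content_type_header("ISO-8859-2")
```

Either mechanism alone is enough; if both are present they should of course name the same encoding.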
I also recommend tagging the various parts of your HTML page with their
language; cf. <http://www.w3.org/TR/REC-html40/struct/dirlang.html>.
For more info on HTML i18n, cf. <http://www.w3.org/International/>.
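
Language tagging of individual passages, as recommended above, could look roughly like this. The sketch is in Python; `tag_language` is a hypothetical helper, and the sample sentence is invented. It relies only on the HTML 4.0 LANG attribute with two-letter language codes.

```python
import html


def tag_language(text: str, lang: str, element: str = "SPAN") -> str:
    """Wrap text in an element that carries an HTML 4.0 LANG attribute."""
    return f'<{element} LANG="{lang}">{html.escape(text)}</{element}>'


# An English paragraph containing one Polish word.
paragraph = (
    '<P LANG="en">The Polish word for "thank you" is '
    + tag_language("dziękuję", "pl")
    + ".</P>"
)

# Print in an ASCII-safe form, using numeric character references.
print(paragraph.encode("ascii", errors="xmlcharrefreplace").decode("ascii"))
```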
You may also wish to read other parts of the HTML 4.0 specification,
and hints for HTML authors:
and to test your HTML source against pertinent validation services: