Re: UTF-8 in web pages

From: schererm@us.ibm.com
Date: Fri Feb 05 1999 - 13:37:01 EST


current versions of internet explorer, netscape, and lynx all support
unicode encodings.
unicode (iso 10646) is _the_ html document character set since version 4.0,
i.e., all unicode characters are supported by html. for example, numeric
character references, whether decimal (&#233;) or hexadecimal (&#xE9;),
are resolved as unicode code points.
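
a small sketch of that last point, in python (my choice for illustration, nothing in this thread depends on it). it uses the standard library's html.unescape, and the sample references are made up:

```python
import html

# decimal and hexadecimal references name the same unicode code point
decimal_ref = html.unescape("&#233;")    # U+00E9
hex_ref = html.unescape("&#xE9;")        # U+00E9 again, written in hex
cjk_ref = html.unescape("&#x4E2D;")      # U+4E2D, well outside iso 8859-1

code_points = [ord(decimal_ref), ord(hex_ref), ord(cjk_ref)]
print([f"U+{cp:04X}" for cp in code_points])
```

the number in the reference is the code point itself, regardless of which charset the page's bytes are in.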
the default charset is still iso 8859-1, which is a subset of unicode
code-point-wise: its 256 values are the first 256 unicode code points.
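
to make "code-point-wise" concrete, here is a rough python check (an illustration only, with a made-up sample string). the code points coincide, but the bytes on the wire do not once you switch to utf-8:

```python
# decode every possible byte value as iso 8859-1
all_bytes = bytes(range(256))
decoded = all_bytes.decode("iso-8859-1")

# each byte value equals the unicode code point it decodes to
same_code_points = all(ord(ch) == b for ch, b in zip(decoded, all_bytes))
print(same_code_points)

# the same text takes different bytes in utf-8, though
sample = "café"
latin1_bytes = sample.encode("iso-8859-1")
utf8_bytes = sample.encode("utf-8")
print(latin1_bytes, utf8_bytes)
```

so an existing iso 8859-1 page is not automatically a valid utf-8 page; only its ascii part carries over unchanged.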
i guess you know the declaration that goes in the page's <head>:
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
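
the declaration only helps if the bytes really are utf-8. a minimal python sketch of that consistency check, assuming a hypothetical page string (the content and variable names are invented for illustration):

```python
import re

# hypothetical page content mixing traditional chinese and accented latin
page = (
    "<html><head>\n"
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
    "</head><body>香港 café</body></html>\n"
)
page_bytes = page.encode("utf-8")  # the bytes actually stored or sent

# the declaration is pure ascii, so it can be found before decoding
match = re.search(rb"charset=([A-Za-z0-9_-]+)", page_bytes)
declared = match.group(1).decode("ascii")

# decoding with the declared charset must give back the original text
round_trip = page_bytes.decode(declared)
print(declared, round_trip == page)
```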

the xml standard requires that all xml processors are able to read both
utf-8 and utf-16.
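
as a quick illustration (python's bundled xml.etree.ElementTree parser, with made-up sample text), the same document can be read from either encoding:

```python
import xml.etree.ElementTree as ET

text = "香港 café"  # made-up sample content

# serialize the same document in both encodings
utf8_doc = f'<?xml version="1.0" encoding="UTF-8"?><p>{text}</p>'.encode("utf-8")
utf16_doc = f'<?xml version="1.0" encoding="UTF-16"?><p>{text}</p>'.encode("utf-16")

# python's "utf-16" codec prepends a byte order mark, which lets the
# parser detect the byte order
from_utf8 = ET.fromstring(utf8_doc).text
from_utf16 = ET.fromstring(utf16_doc).text
print(from_utf8 == from_utf16 == text)
```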

best regards,

markus

Markus Scherer IBM RTP +1 919 486 1135 Dept. Fax +1 919 254 6430
schererm@us.ibm.com
                        Unicode is here! --> http://www.unicode.org/

"John O'Conner" <joconner@geocities.com> on 99-02-05 12:15:33

To: Unicode List <unicode@unicode.org>
Subject: UTF-8 in web pages

I have a client that has a requirement to support several
languages on their website and e-commerce store. I want to
help them manage the storage of information and dynamic web
pages by suggesting a common character set for all
languages...Unicode.

It seems like a no-brainer to select Unicode for my database
character set because of their multi-language needs.
However, I'm concerned about Unicode in web pages. I have
browsed several UTF-8 pages with success, but I notice that
the industry hasn't really picked up on UTF-8 as an HTML
content encoding. Do any of you have any success/failure
stories that you can share? How comfortable would you be
recommending UTF-8 for HTML content? Oh, here's one more
piece of information...the customer has traditionally used
Big 5 for all their encoding needs. Actually...they've used
an extension for their special chars in Hong Kong that don't
seem to be available in Big 5.

Regards,
John O'Conner


