Re: UTF-8 code in HTML

From: Mark Davis (markdavis@ispchannel.com)
Date: Sat Apr 15 2000 - 16:51:14 EDT


One thought:

1. Make a simple web page explaining how to set up different browsers with the right fonts to read UTF-8.
2. Make a button-like GIF that says something like "Display Problems?" with a link to the page.
3. Get volunteers to translate this page and the text in the GIF into multiple languages.
4. Post the pages and GIFs on the Unicode site in an accessible area.
5. Encourage people to use the linked GIFs on their own sites, and/or copy them and modify as they see fit.

Do you think this kind of thing would help?

Mark

Glen Perkins wrote:

> I wonder how big a problem a typical large corporation would actually face
> if they switched from the current "legacy encodings" in each world market to
> UTF-8. I'm not wondering if there would be a problem, yes or no, from a
> purist perspective. I mean what are the numbers, market by market, of people
> who would have problems with UTF-8 vs. the numbers of people who have
> problems caused by the current encodings, weighted by the seriousness of
> those problems.
>
> For example, how big is the risk of using UTF-8 for the US market? It
> seems as though it's probably a little riskier than Latin-1, but is it
> really? How much riskier? By "how much", I mean what percentage of visitors
> to the site would have a problem with UTF-8 vs. what percentage would have a
> problem with Latin-1. It's not as if there are no Latin-1 problems, after
> all. If you build a "Latin-1" app server, people will immediately start
> shoving CP1252 curly quotes, trademark signs, etc. into it, which will
> probably break when served to a Mac or Unix box.
>
> Then, what percentage of the French market would have trouble with UTF-8 vs
> Latin-1? You have similar CP1252 problems, plus the Euro issue. What
> percentage of browsers would have problems with a well-built UTF-8 page *in
> French*, given the actual installed base of browsers in France today?
>
> What of the Polish market? Addison, your point is well taken about the
> browsers of non-Polish-OS users needing special setup for viewing UTF-8
> encoded Polish text, but for a realistic market analysis, that may not
> matter very much. More likely, the questions would be, how would a UTF-8
> encoded web page fare in the actual Polish market and, again realistically,
> even if it had some problems, how much would it matter, given that the
> Polish market is likely to comprise only a very small percentage of your
> worldwide viewership. I believe that with a Polish OS and a reasonably
> recent browser in default configuration, UTF-8 would work fine. (Correct me
> if I'm wrong.) Then, the question would be, how many Polish-speaking users
> of non-Polish OSes are there, and would you be targeting them anyway? In
> fact, would you have Polish content on your website at all if you couldn't
> just piggyback on the app server's UTF-8 infrastructure built for other
> markets? After all, your US viewers who still browse with Netscape 1.0 (or
> Lynx) may outnumber all of your Polish viewers, and you probably don't make
> major design decisions based on the needs of Netscape 1.0 or Lynx users. And
> if the Polish user isn't using a Polish OS because he works in Germany, for
> example, then maybe your German pages are actually more applicable to him
> anyway. If so, then the browser's failure to handle Polish in UTF-8 by
> default probably won't matter.
>
> Then there's Japan. Now here's where it appears that a native Japanese
> speaker using a Japanese OS and Netscape 4.x in its default state will be
> unable to render Japanese encoded in UTF-8 because the default font for
> UTF-8 is a western font. Jungshik is saying that Netscape does "font
> switching", if I'm understanding him correctly, which should obviate this
> problem, but it was my understanding that this didn't become a feature until
> Mozilla. Maybe it's true of Netscape 4.7, but I thought all Netscape 4.x versions
> in Japan had a "one default font per encoding" limitation, and that a
> non-Japanese font was made the default for Japanese Netscape 4.x.
>
> Well, Japan's a big market, so the question then becomes what percentage of
> Japanese viewers would have trouble viewing Japanese (not Polish) in UTF-8
> vs. what percentage would have various troubles viewing, say, EUC-JP. Not,
> "does the problem exist", but to what extent, and how fast is it
> disappearing.
>
> Then Korea, Taiwan, China, etc. Are they the same as Japan? By that I mean
> 1) in the behavior of the browsers, 2) in the current market share of the
> browsers, and 3) (hard to measure but important) how much do those users of
> UTF-8-challenged browsers really correspond to your target market anyway?
>
> I'd be interested to know if anyone has more detailed info on just what the
> current magnitude of the problem would be per market. (Not just standard
> browser stats, but the degree to which UTF-8 would work for the native
> language on each browser used in that locale/market.) I'd like to see
> statistics of this sort tracked on the home page of Unicode.org. We could
> have projections of milestones that would certainly make for good PR:
> "According to the Unicode Consortium, the percentage of browsers worldwide
> that are unable to handle a Unicode page in the user's native language will
> drop below 5% by mid-August...." It may be that the benefits are going to
> outweigh the remaining problems sooner than even we realize. Since a lot of
> people rely on the folks on this list to tell them when UTF-8 is "safe to
> use", we ought to keep our eye on these numbers. We don't want the idea that
> "the market isn't really ready for UTF-8 to the browser" to become
> fossilized as conventional wisdom, carved in stone like "Unicode doubles the
> size of all text data", independent of changing market statistics.
>
> __Glen Perkins__
>
> ----- Original Message -----
> From: Addison Phillips [GSC]
>
> It's exciting that we're on the cusp of general support for Unicode in the
> browsers, operating systems, and languages (perl just got a transfusion, for
> example)... and the support will "just be there" without thinking. About
> time.
>
> Best regards,
>
> Addison
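
As a small illustration of two points raised in the quoted message (CP1252 characters leaking into "Latin-1" data, and the "Unicode doubles the size of all text data" claim), here is a minimal Python sketch. The sample string and variable names are invented for this example and do not come from the thread; it shows only how the encodings behave, not how any particular browser or app server handles them.

```python
# Illustrative sketch only; the sample string is made up for this example.
sample = "\u201cCaf\u00e9\u201d \u2122"  # curly quotes, e-acute, trademark sign

# CP1252 assigns the curly quotes and trademark sign to bytes in 0x80-0x9F.
cp1252_bytes = sample.encode("cp1252")

# ISO-8859-1 (Latin-1) has no such characters, so strict encoding fails.
try:
    sample.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# Reading the CP1252 bytes as Latin-1 yields C1 control codes, not quotes.
misread = cp1252_bytes.decode("latin-1")

# UTF-8 can carry every character in the sample and round-trips cleanly.
utf8_bytes = sample.encode("utf-8")

# Pure ASCII text takes one byte per character in UTF-8, so it does not grow.
ascii_text = "plain ASCII text"
ascii_utf8_len = len(ascii_text.encode("utf-8"))

print("Latin-1 can encode sample:", latin1_ok)
print("CP1252 bytes:", cp1252_bytes)
print("Misread as Latin-1:", repr(misread))
print("UTF-8 length:", len(utf8_bytes), "bytes for", len(sample), "characters")
print("ASCII length in UTF-8:", ascii_utf8_len, "for", len(ascii_text), "characters")
```

The misread string is what a client that trusts a "Latin-1" label would get when the data is really CP1252, which is the breakage described for Mac and Unix clients above.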


