Re: Multilingual Documents [was: HTML forms and UTF-8]

From: A. Vine (avine@eng.sun.com)
Date: Thu Dec 02 1999 - 15:21:57 EST


Michael Everson wrote:
>
> At 18:18 -0800 1999-12-01, A. Vine wrote:
>
> >The bottom line was then and is now, how much are folks willing to pay for
> >multilingual capability, and how many folks are willing to pay it? Software
> >companies are for-profit organizations. Multilingual support is not trivial.
> >It costs a tremendous amount of money to garner the expertise, evaluate the
> >product(s), design, and code, for multilingual.
>
> Well, Andrea, it depends _how_ multilingual. In Unicode terms, if it has to
> do with representing many complex scripts, that costs lots in terms of
> rendering. But what I don't want to see is a limitation of character
> repertoire, say, in the Latin script.

Neither do I. But the point is that if I serve up UTF-8 or UTF-16 on HTML
pages, most people will not see data outside of ASCII correctly. We're not
there yet. I _can_ serve UTF-8 to folks in HTML pages should they choose UTF-8
as their preferred charset (providing our customers gave their customers that
option). But as it turns out, very few folks actually _do_ choose UTF-8, or
request multilingual capabilities that would result in our serving them UTF-8.

I think we're moving in that direction. But there are a lot of old HTML browsers
out there, and as we're trying to reach a broad audience, we do a lot of
conversion from our internal Unicode into other charsets.
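
For illustration only, here is a rough sketch of that kind of conversion: pick
a charset the client says it prefers, then encode the internal Unicode text
into it, falling back to HTML numeric character references for anything the
target charset cannot represent. This is not our actual code. The names
pick_charset and encode_page, the list of supported charsets, and the
ISO-8859-1 default are all assumptions made up for the example.

```python
# Hypothetical sketch: names, supported list, and default are assumptions.
SUPPORTED = ("utf-8", "iso-8859-1", "iso-8859-2")
DEFAULT_CHARSET = "iso-8859-1"  # assumed fallback for older browsers


def pick_charset(accept_charset, supported=SUPPORTED):
    """Pick the highest-q supported charset from an Accept-Charset-style value."""
    prefs = []
    for item in accept_charset.split(","):
        parts = item.strip().split(";")
        name = parts[0].strip().lower()
        q = 1.0  # a missing q parameter means full preference
        for param in parts[1:]:
            param = param.strip()
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        if name:
            prefs.append((q, name))
    # Stable sort: equal q values keep the order the client listed them in.
    prefs.sort(key=lambda pref: -pref[0])
    for q, name in prefs:
        if q > 0 and name in supported:
            return name
    return DEFAULT_CHARSET


def encode_page(text, charset):
    """Encode internal Unicode text for the wire.

    Characters the target charset lacks become &#NNN; references, which
    HTML browsers can still interpret even when the charset is 8-bit.
    """
    return text.encode(charset, errors="xmlcharrefreplace")


if __name__ == "__main__":
    charset = pick_charset("iso-8859-1, utf-8;q=0.5")
    print(charset, encode_page("Dvo\u0159\u00e1k", charset))
```

With ISO-8859-1 chosen, the Latin-2 letter in "Dvořák" goes out as a numeric
reference rather than being lost; with UTF-8 or ISO-8859-2 it is encoded
directly.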

>
> What Asmus said is an example of the problem:
>
> >> 4) Most of my (European) newspapers easily cross alphabet boundaries
> >> (e.g. use of correct Latin-2 accents is common in Latin-1 languages).
>
> This is 8-bit limited-character-set thinking, and it worries me, because if
> we go dipping down to the bottom line of "Latin letters people pay for"
> then the users of less economically powerful languages will be left out in
> the cold. As they have been for far too long.

It's not that people are paying per character. But multilingual capability,
even in a Unicode situation, costs more than monolingual.

Andrea


