From: Edward H. Trager (firstname.lastname@example.org)
Date: Mon Mar 22 2004 - 19:25:18 EST
On Monday 2004.03.22 16:53:52 -0000, John Snow wrote:
> I work on sales rather than technical so I apologise in advance if this
> is basic!
> I am speaking to a client regarding their website being translated into
> a number of languages including Bengali, Urdu and Punjabi which I am
> told is not very well supported by Unicode.
> Is this the case?
> What are the potential solutions?
Here is how I would address this problem:
(1) STEP ONE: CREATE SAMPLE WEB PAGES IN TARGET LANGUAGES ENCODED IN UTF-8
The technical staff at your company should create some example web pages
encoded using the Unicode UTF-8 transformation format. The web pages
would of course contain text in each of the target languages.
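As a rough illustration of step one, here is a minimal Python sketch that builds a small test page and writes it out as UTF-8. The function names, the output file name "utf8_test.html", and the sample words (each language's own name) are placeholders of mine, not anything from your client's site; real test pages should use running text supplied by your translators.

```python
from pathlib import Path

# Placeholder samples (an assumption for illustration): each language's own
# name, written in the script it is most commonly published in.
SAMPLES = {
    "bn": ("Bengali", "বাংলা"),
    "ur": ("Urdu", "اردو"),
    "pa": ("Punjabi", "ਪੰਜਾਬੀ"),
}


def build_sample_page(samples=SAMPLES):
    """Return the test page as a str; it is encoded only when written out."""
    rows = []
    for lang, (name, text) in samples.items():
        # Urdu is written right-to-left; the other two samples are left-to-right.
        direction = "rtl" if lang == "ur" else "ltr"
        rows.append(f'<p lang="{lang}" dir="{direction}">{name}: {text}</p>')
    body = "\n".join(rows)
    return (
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
        '"http://www.w3.org/TR/html4/strict.dtd">\n'
        "<html>\n<head>\n"
        # Declare the encoding so the browser does not have to guess it.
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        "<title>UTF-8 rendering test</title>\n"
        "</head>\n<body>\n" + body + "\n</body>\n</html>\n"
    )


def write_sample_page(path="utf8_test.html"):
    """Encode the page as UTF-8 and write the raw bytes to disk."""
    data = build_sample_page().encode("utf-8")
    Path(path).write_bytes(data)
    return data


if __name__ == "__main__":
    write_sample_page()
```

The resulting file can be opened directly in each browser under test. If the pages are served from a web server instead, the server's Content-Type header should also say charset=utf-8, since a conflicting header can override the meta declaration.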
(2) STEP TWO: TEST THE SAMPLE WEB PAGES IN APPROPRIATE BROWSERS
To wit: on PCs, there are four major web browser "engines":
(1) Internet Explorer
(2) The Gecko Engine used in Mozilla and Netscape
(3) Opera
(4) The KHTML Engine used in Safari (Mac OSX) and in KDE Konqueror (Linux)
Don't bother testing older browsers for these languages: just test the latest
version of IE, Mozilla/Netscape, Opera, and Safari/Konqueror.
Also, I would not bother testing Windows OSes prior to Windows 2000/XP.
Also, of course one has to have fonts installed on the client computers. Start
by making sure Microsoft Arial Unicode is installed on the Windows boxes. I'm
not sure what big Unicode font is the "standard" on Mac OSX. Bitstream Cyberbit
Unicode font, being freely downloadable, is a good starting point for Linux.
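Before checking fonts, it helps to know which scripts the sample text actually contains, since that is what the installed fonts must cover. The following Python sketch is one rough way to find out. The function name script_inventory is my own, and using the first word of each character's Unicode name as a stand-in for its script is a simplifying assumption, not an official script lookup.

```python
import unicodedata
from collections import Counter


def script_inventory(text):
    """Count characters by the first word of their Unicode character name.

    This is only a rough proxy for "script" (an assumption of this sketch):
    for example, BENGALI LETTER BA is counted under BENGALI.
    """
    counts = Counter()
    for ch in text:
        if ch.isspace():
            continue
        name = unicodedata.name(ch, "")  # "" for characters without a name
        if name:
            counts[name.split()[0]] += 1
    return counts


if __name__ == "__main__":
    sample = "বাংলা اردو ਪੰਜਾਬੀ"
    for label, count in script_inventory(sample).most_common():
        print(label, count)
```

Note that Urdu text is reported under ARABIC, because Urdu is written in the Arabic script, and that Punjabi in the Gurmukhi script is reported under GURMUKHI. A font therefore needs Bengali, Arabic, and Gurmukhi coverage to display all three samples.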
You will quickly notice, for example, that any rendering bugs that you see
in Safari are going to be almost identical to those seen in Konqueror (Linux
is going to be, if not already, a significant platform for your customers in
India and South Asia, so don't neglect it), and thus you can prove to yourself
that the two browsers do indeed share a common foundation. The same can be
done with Netscape 7.x and Mozilla (and of course the various Mozilla derivatives
like Firefox or whatever they are now calling it...).
Of the four browser engines, you are most likely going to get better results with
Internet Explorer and Mozilla. With regard to Mozilla/Netscape, you are most likely
going to get better results on the Windows platform than on Linux because Mozilla
can use Microsoft's Uniscribe layout engine on Windows, whereas it has to resort to using
the Pango layout engine on Linux (which I believe still lags behind Uniscribe in some
areas related to complex script layout which are quite relevant to your customer's target
languages). On Mac OSX, I'm not sure what layout engine is used.
In any case, testing yourself (or having your company's technical staff do the testing
on your behalf, which achieves the same result but less painfully for you ;-) ) is the only
way you are going to be able to demonstrate what is really going to work or not work for
your client.
Should it be the case that W3C-validated, UTF-8-encoded sample pages don't render sufficiently
well on the current crop of browsers, then one can always resort to the old trick of
producing GIF or PNG images of all of the text. I have seen this done much more frequently
than I would have thought necessary on Chinese and Arabic web sites. I would not want
personally to have to go that route (Unicode is a better solution, while using images is
just a workaround that can only be considered a temporary solution).
Write to me offline if you have additional questions about what I have presented above.
-- Ed Trager
> John Snow
> Business Development Director
> +44 (0) 870 990 5166
> ALS Translation Service <http://www.appliedlanguage.com/>
> Translation services for 140 different languages
> ALS Website Translation
> Translation for all file types, formats and technology