Re: Designing a multilingual web site

From: Michael (michka) Kaplan (michka@trigeminal.com)
Date: Sun Jul 16 2000 - 16:22:01 EDT


From: "Munzir Taha" <munzir_taha@yahoo.com>

>>The updates on the Unicode site are being done
>>for all new pages and for the rest as they have time.
> Why it needs time? Why is it more than just replacing
> windows-1252 with utf-8?!

A question best asked of the people who maintain the site... although I
would never be so presumptuous as to assume that someone else must do the
work. I think it would be perfectly acceptable if they did not encode as
UTF-8 those pages that do not require it.
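
As a rough illustration of which pages "do not require it", here is a
minimal Python sketch. The two sample pages are made up for the example.
It assumes nothing beyond Python's built-in codecs: a page containing only
ASCII characters has identical bytes in windows-1252 and UTF-8, while a
page with any other character has to be transcoded, not just relabeled.

```python
# Hypothetical sample pages, invented for this illustration.
ascii_page = "<p>Plain English text</p>"
accented_page = "<p>caf\u00e9</p>"  # contains U+00E9 (e with acute accent)

# ASCII-only content: the bytes are the same under both encodings,
# so changing the charset label is harmless.
ascii_same = ascii_page.encode("windows-1252") == ascii_page.encode("utf-8")

# Non-ASCII content: the two encodings produce different bytes.
cp1252_bytes = accented_page.encode("windows-1252")
utf8_bytes = accented_page.encode("utf-8")

# Relabeling only: reading the windows-1252 bytes as UTF-8 fails here.
try:
    cp1252_bytes.decode("utf-8")
    relabel_ok = True
except UnicodeDecodeError:
    relabel_ok = False

# Transcoding: decode with the old charset, re-encode with the new one.
transcoded = cp1252_bytes.decode("windows-1252").encode("utf-8")

print("ASCII page identical in both encodings:", ascii_same)
print("Relabel without transcoding works:", relabel_ok)
print("Transcoded bytes match UTF-8:", transcoded == utf8_bytes)
```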

> I can't understand the point. Sorry for that but I am
> not an English native speaker, especially if there is a
> misspelled word ;-)

My point was that I give my English HTML page to a localizer, and she
returns it to me as (for example) a Simplified Chinese page using the gb2312
charset tag. If I can use what she gives me, unchanged, then this is better,
in my opinion. If I have to manage 12-70 languages, is it not better to make
it as easy for me as possible? :-)

>>I work almost exclusively in HTML view and
>>occasionally *Preview* view (NOT normal
>>view!!!). But in doing this I not only have support
>>for displaying languages with Arabic scripts, but
>>typing them as well
> But even in HTML view I can't type arabic in the
> banner text since they are saved in different place.

I do not use banners, maybe that is why I do not have a problem?

>>Unicode = UTF-16le
>>Unicode (Big Endian) = UTF-16be
>>UTF-8 = UTF-8
>Do you really mean these three variations. and what
>le and be stands for? I have tried to save an arabic
>document in utf-8, unicode, unicode (big endian) and
> all of them saves correctly! What this means?

This is correct. LE stands for little endian and BE stands for big endian
(they refer to the order of the bytes within each 16-bit unit). What it
means is that all three are separate encodings of the exact same standard.
You should expect (demand?) lossless conversion between them. UTF-8 is the
only one of the formats you can use in FrontPage 2000 (I cannot speak for
the next version of FP since it has not yet shipped!)
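
A minimal Python sketch of the lossless-conversion point, assuming only
Python's built-in codec names ("utf-16-le", "utf-16-be", "utf-8") as
stand-ins for the three save options. The sample string is an arbitrary
Arabic word chosen for the illustration.

```python
# Arbitrary Arabic sample text (the word "marhaba"), written with escapes.
text = "\u0645\u0631\u062d\u0628\u0627"

# Python codec names standing in for the three save options.
encodings = ["utf-16-le", "utf-16-be", "utf-8"]
encoded = {name: text.encode(name) for name in encodings}

for name, data in encoded.items():
    # The bytes differ per encoding, but each decodes to the same text.
    assert data.decode(name) == text
    print(name, len(data), "bytes, starts with", data[:4].hex())

# Converting between encodings is lossless: decode, then re-encode.
utf8_from_le = encoded["utf-16-le"].decode("utf-16-le").encode("utf-8")
assert utf8_from_le == encoded["utf-8"]

# For this text, LE and BE differ only in the order of the two bytes
# inside each 16-bit unit.
le, be = encoded["utf-16-le"], encoded["utf-16-be"]
swapped = all(
    le[i] == be[i + 1] and le[i + 1] == be[i] for i in range(0, len(le), 2)
)
print("LE is BE with each byte pair swapped:", swapped)
```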

>When I recieve a message in arabic in Outlook, I
>can change the encoding to Arabic (Windows) to view
>it correctly. But still the subject is wrong. How can I
>view it correctly.

Well, this is getting a bit out of my area since I do not use Outlook (I use
Outlook Express and Exchange 5.0 for e-mail) and it is of course very off
topic for the Unicode mailing list. But if you mean the dialog caption, then
this is by design in Windows: the system supplies the caption and it's up to
the system to display what it can.

MichKa

random junk of dubious value at the multilingual
http://www.trigeminal.com/ and a new book on
i18N in VB at http://www.trigeminal.com/michka.asp


