Re: Unicode Support in Java

From: Martin J Duerst (mduerst@ifi.unizh.ch)
Date: Mon Mar 25 1996 - 15:24:24 EST


Michael Kung wrote:

>On Mar 25, 5:40pm, Martin J Duerst wrote:
>> Subject: Re: Unicode Support in Java
>> Michael Kung wrote:

>> >You need to check what your Language Encoding under Option is set to.

>>
>> The encoding option (assumed MIME "charset" parameter in HTTP header
>> in the absence of such a parameter) is not directly related to this.
>> Even if you set that to ISO-2022-JP (a Japanese encoding), that does
>> not mean that Java then will run with Japanese. If it in fact does,
>> that would mean that Netscape does not comply with the Java specifications.
>> For Java, you don't need any kind of encoding option setting, as everything
>> there is (supposed to be) Unicode.
>> Of course, if Netscape offered Unicode, or UTF-8, as an encoding option,
>> chances would be high that they also support Unicode in Java, but this
>> is the only relation between these two things.
>>
>> Regards, Martin.
>>-- End of excerpt from Martin J Duerst
>
>Noooo. I am not talking about the HTTP MIME heading. The Language Option is
>the conversion that Netscape 2.0 will perform for the incoming encoding to the
>current runtime codeset.
>
>Since I don't have the NT version of Netscape 2.0, I don't know whether they
>have the Unicode support in the language option.

Michael - Allow me to disagree again. I don't have the NT version of Netscape 2.0,
but I have the Mac and the SunOS versions. On the Mac, in the second-to-last
position of the Option menu, there is "Document Encoding". On the Sun,
in the same position, it reads "Language Encoding". The Mac term is definitely
better, because this has only marginal connections with language; with a
document encoding such as Latin-2, you can cover quite a bunch of languages.

If it's this option that you had in mind originally, then I am definitely right that
this setting is only used if the HTTP header does not supply a MIME "charset"
parameter. Both this MIME "charset" parameter and the "Document Encoding"
option serve the *same* purpose, namely, to determine the conversion
from the incoming encoding to whatever is used internally. In a perfect
world, the documents would identify themselves in the HTTP header; the
"Document Encoding" option is only a fix for documents that don't include
the necessary header field and that the user thinks won't make sense when
interpreted as ISO-8859-1 (Latin-1), which is the HTTP/HTML default.
If you have any doubts about this, please read the relevant Internet
drafts (among others draft-ietf-html-i18n-03.txt, of which I am a
coauthor).
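
The precedence described above (the "charset" parameter in the HTTP header
first, then the user's "Document Encoding" setting, then the ISO-8859-1
default) can be sketched in Java roughly as follows. This is only an
illustration, not Netscape's actual code: the class name CharsetResolver,
its method names, and the simplified header parsing are all assumptions
made for the example.

```java
import java.util.Locale;

// Hypothetical helper illustrating the precedence described in the message.
public class CharsetResolver {
    // HTTP/HTML default named in the message.
    static final String DEFAULT_CHARSET = "ISO-8859-1";

    // Extracts the "charset" parameter from a Content-Type value,
    // or returns null when no such parameter is present.
    // Simplified parsing: splits on ';' and does not handle escaped quotes.
    static String charsetFromContentType(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq <= 0) {
                continue;
            }
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (name.equals("charset")) {
                String value = param.substring(eq + 1).trim();
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    // The header wins; the user's "Document Encoding" option is consulted
    // only when the header carries no charset; otherwise use the default.
    static String resolve(String contentType, String documentEncodingOption) {
        String fromHeader = charsetFromContentType(contentType);
        if (fromHeader != null) {
            return fromHeader;
        }
        if (documentEncodingOption != null && !documentEncodingOption.isEmpty()) {
            return documentEncodingOption;
        }
        return DEFAULT_CHARSET;
    }

    public static void main(String[] args) {
        System.out.println(resolve("text/html; charset=ISO-2022-JP", "ISO-8859-2"));
        System.out.println(resolve("text/html", "ISO-8859-2"));
        System.out.println(resolve("text/html", null));
    }
}
```

In this sketch the user's setting never overrides a charset that the
server declared; it is only a fallback for unlabelled documents.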

You also mention a "Language Option" above. Apart from what should
be better called "Document Encoding", but appears in some versions
of Netscape as "Language Encoding", there is also an option on
the Mac that is rightfully called "Language". It appears under
"General Preferences...". It serves to choose your language preferences,
i.e. whether you want a document served in English, Japanese, German,
American English, British English, and so on (provided it exists in that
language on the server side). This is something that will still be
of interest even if we (hopefully) soon don't have to select "Document
Encoding" anymore (because the documents are correctly marked
in the HTTP header, or because they all come in Unicode or UTF-8).
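
As a rough illustration of how a "Language" preference differs from an
encoding setting, the following Java sketch picks the first of the user's
preferred languages that the server actually has. It is an assumption-laden
toy: the class LanguageChooser and its choose method are hypothetical, and
it matches language tags only by exact (case-insensitive) comparison.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical server-side selection of a document variant by language.
public class LanguageChooser {
    // Returns the first preferred language that is available on the server,
    // or the fallback when none of the preferences can be satisfied.
    static String choose(List<String> preferred, List<String> available, String fallback) {
        for (String want : preferred) {
            for (String have : available) {
                if (have.equalsIgnoreCase(want)) {
                    return have;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        List<String> preferred = Arrays.asList("ja", "de", "en");
        List<String> available = Arrays.asList("en", "de");
        System.out.println(choose(preferred, available, "en"));
    }
}
```

Note that nothing here touches character encodings: the choice of language
variant and the choice of document encoding are independent decisions.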

Hope this helps. Regards, Martin.


