Re: Communicator and Unicode revisited

From: Glenn Adams (glenn@spyglass.com)
Date: Tue Sep 16 1997 - 03:34:19 EDT


At 10:12 PM 9/15/97 -0700, Adrian Havill wrote:
>Do the other major browsers out there (IE and Hotjava, for example) plan
>to announce "UTF-8" support in the HTTP header as well? If so, this is
>very significant in that a very large portion of the personal computer
>browser market will finally be announcing "yes, I speak Unicode." This,
>IMO, is the biggest boost for Unicode since Java.

In Spyglass' International Device Mosaic 2.0 product, ACCEPT-CHARSET is
sent with every HTTP request. The value sent is an ordered list
of CHARSET designators consisting of the current default CHARSET (as specified
by user preferences), the default CHARSET for the current preferred content
language (this language is also specified by user preferences), followed by
UTF-8, UNICODE-2-0, then '*'.
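
As a rough illustration of the ordering described above, here is a minimal Python sketch that assembles such a header value. It is not the product's code: the function name, the language-to-charset table, and the choice to drop duplicate entries are all assumptions made for the example.

```python
# Hypothetical mapping from preferred content language to a default charset.
# These pairs are illustrative assumptions, not the product's actual table.
LANGUAGE_DEFAULT_CHARSET = {
    "ja": "SHIFT_JIS",
    "ru": "KOI8-R",
    "en": "ISO-8859-1",
}


def build_accept_charset(user_charset, preferred_language):
    """Return an Accept-Charset value in the order described in the post.

    Order: the user's default charset, the default charset for the user's
    preferred content language, UTF-8, UNICODE-2-0, then the '*' wildcard.
    Dropping duplicates is an assumption of this sketch.
    """
    candidates = [
        user_charset,
        LANGUAGE_DEFAULT_CHARSET.get(preferred_language),  # None if unknown
        "UTF-8",
        "UNICODE-2-0",
        "*",
    ]
    seen = set()
    ordered = []
    for charset in candidates:
        if charset is None:
            continue
        key = charset.upper()  # charset names compare case-insensitively
        if key in seen:
            continue
        seen.add(key)
        ordered.append(charset)
    return ", ".join(ordered)


if __name__ == "__main__":
    print("Accept-Charset: " + build_accept_charset("ISO-8859-1", "ja"))
```

The value is kept as a plain ordered list, as in the description above, rather than using quality values to express preference.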

>2) Can Navigator read a UCS-2 file if it doesn't have a byte-order mark
>in the front? I've tried both big- and little-endian formats, as well as
>setting the header to return "UNICODE-1-1" (the real HTTP header, not
>the META tag). (Is there a "UCS-2" charset type?) Even though the header
>is there, if the byte-order mark isn't, it garbles both big- and
>little-endian. With U+FEFF there, no problem.

I can't say what Navigator does, but in the Spyglass browser I referred to above,
we auto-detect UCS-2 (i.e., 16-bit Unicode) content with or without a byte
order mark in either standard (big-endian) or non-standard (little-endian) byte order.
Note that the Unicode standard requires byte serialization of Unicode in big-endian
order; however, most servers don't pay much attention to the encoding, so one has
to accommodate the chaos.
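
To make the idea of detection without a byte order mark concrete, here is a minimal Python sketch of one possible heuristic. It is an assumption for illustration only, not a description of what the Spyglass browser does: it checks for U+FEFF first, and otherwise compares how many zero bytes fall in the high-byte position under each byte order, which only helps when the content contains a fair amount of text in the U+0000 to U+00FF range (HTML markup, for instance).

```python
def guess_ucs2_byte_order(data):
    """Guess the byte order of UCS-2 data: "big", "little", or None.

    Hypothetical heuristic, not any browser's actual algorithm.
    """
    # A byte order mark (U+FEFF) settles the question immediately.
    if data[:2] == b"\xfe\xff":
        return "big"
    if data[:2] == b"\xff\xfe":
        return "little"

    # Without a mark, the data must at least be a whole number of
    # 16-bit units to be plausible UCS-2.
    if len(data) < 2 or len(data) % 2 != 0:
        return None

    # For characters in U+0000..U+00FF the high byte is zero. In
    # big-endian data the high byte comes first (even offsets); in
    # little-endian data it comes second (odd offsets).
    zeros_at_even = data[0::2].count(0)
    zeros_at_odd = data[1::2].count(0)

    if zeros_at_even > zeros_at_odd:
        return "big"
    if zeros_at_odd > zeros_at_even:
        return "little"
    return None  # no evidence either way


if __name__ == "__main__":
    sample = "<html>".encode("utf-16-le")  # little-endian, no mark
    print(guess_ucs2_byte_order(sample))
```

A heuristic like this can be wrong for content with few Latin-range characters, so a real implementation would likely combine it with other evidence such as the declared CHARSET.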

Regards,
Glenn Adams


