Re: FW: Japanese characters -- HTML for both MAC and Windows...

From: Aki Inoue (aki@apple.com)
Date: Mon Jun 29 1998 - 17:08:46 EDT


Looks like I'm very fortunate to be living on a platform that has a
browser capable of displaying your HTML source properly. I tried
your source with:
- Rhapsody DR2
- OmniWeb3 Beta7

 Browser      | enc | text window | title bar  | menus      | source
 -------------+-----+-------------+------------+------------+------------
 OmniWeb      | UTF | ok          | ok         | ok         | ok
              | dec | ok          | not tested | not tested | N/A
              | hex | ok          | not tested | not tested | N/A
              | HEX | ok          | not tested | not tested | N/A

I had to change the encoding setting to UTF-8 manually (the browser
does not support the META tag with a charset key), but otherwise it
displayed them all properly.
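
For illustration only, here is a minimal Python sketch of how a client
could read the charset declared in a META http-equiv tag instead of
relying on a manual setting. The names MetaCharsetFinder and
declared_charset are made up for this example, and this is not how
OmniWeb or any browser mentioned here actually works; it uses only the
Python standard library.

```python
from email.message import Message
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Hypothetical helper: remembers the first charset declared via META."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta" or self.charset is not None:
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() != "content-type":
            return
        content = attr.get("content")
        if not content:
            return
        # Reuse the stdlib MIME header parser for "text/html; charset=...".
        msg = Message()
        msg["Content-Type"] = content
        self.charset = msg.get_content_charset()  # lower-cased, or None


def declared_charset(html_text):
    """Return the charset named in a META Content-Type tag, or None."""
    finder = MetaCharsetFinder()
    finder.feed(html_text)
    return finder.charset


sample = ('<HEAD><META http-equiv="Content-Type" '
          'content="text/html; charset=UTF-8"><TITLE>t</TITLE>')
print(declared_charset(sample))
```

A real browser would also have to consider the HTTP Content-Type header
and the user's own setting; the sketch covers the META tag only.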

Aki Inoue
Apple Computer Inc.

Begin forwarded message:

> From: Otto Stolz <stolz@iris.rz.uni-konstanz.de>
> Date: Mon, 29 Jun 1998 08:31:56 -0700 (PDT)
> To: Unicode List <unicode@unicode.org>
> Subject: Re: FW: Japanese characters -- HTML for both MAC and Windows...
> Cc: chris@twin.ftech.co.uk
> X-Uml-Sequence: 5414 (1998-06-29 15:31:57 GMT)
>
> On 19 June 1998, Cristina Mateo wrote:
> > I need to write Japanese characters.
> > How do I do that in HTML for both Mac and Windows?
>
> On 1998-06-23, I wrote:
> > HTML 4.0 <http://www.w3.org/TR/REC-html40/> has adopted the UCS
> > (ISO/IEC 10646-1:1993 + TAs, or Unicode 2.0 and above) as the
> > document character set for HTML. This means that you can, in
> > principle, use any UCS character in any WWW page that includes an
> > HTML 4.0 version declaration, see
> > <http://www.w3.org/TR/REC-html40/struct/global.html#h-7.2>.
> >
> > Of course, which characters the reader of your page will be able to
> > see depends on the browser and fonts he or she has installed.
>
> Today, I conducted a little test with three browsers and the HTML
> source quoted below (which was stored locally on my system).
> The browsers I tested are:
> - Alis Tango v3.1.1 c1.0
> - Microsoft Internet Explorer 4.0 Version 4.71.1712.6
> - Netscape Communicator 4.05
> all running under
> - Microsoft Windows 95 4.00.950.B
>
> The sad result: none of the browsers in my sample was fully compatible
> with the HTML 4.0 specification! The best results across all browsers
> are obtained with decimal NCRs (numeric character references); the most
> popular browsers do not understand hexadecimal NCRs. (Meanwhile, HTML
> authors will be able to work out decimal NCRs only with the help of a
> hex-to-decimal converting calculator!)
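
As a rough illustration of the conversion mentioned above: the UTF-8
bytes and both NCR spellings can all be derived from the character's
code point. This is a minimal Python sketch; the function name
ncr_forms is made up for this example and is not part of the original
test.

```python
import html


def ncr_forms(ch):
    """Return the UTF-8 bytes (as hex) and both NCR spellings of one character."""
    cp = ord(ch)  # Unicode code point, e.g. 0x7881 for the Go character
    return {
        "utf8_hex": " ".join(f"{b:02X}" for b in ch.encode("utf-8")),
        "dec_ncr": f"&#{cp};",
        "hex_ncr": f"&#x{cp:X};",
    }


forms = ncr_forms("\u7881")
print(forms)

# Decoding goes the other way; html.unescape accepts both x and X prefixes.
decoded = [html.unescape(s) for s in ("&#30849;", "&#x7881;", "&#X7881;")]
print(decoded)
```

For U+7881 this reproduces the values used in the test page below:
E7 A2 81, &#30849; and &#x7881;.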
>
> Best wishes,
> Otto Stolz
>
> --------------- Appendix ----------
>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN"
> "http://www.w3.org/TR/REC-html40/strict.dtd">
> <HEAD>
> <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
> <TITLE>&#30849;-Test</TITLE>
> <BODY>
> <H1>&#30849; &#8212; my favourite game</H1>
> <P>In HTML 4.0,
> I can code the character for Go (aka Baduk or Wei-ch'i)
> in the following ways:
> </P>
> <TABLE BORDER>
> <TR><TH><A HREF="http://www.w3.org/TR/REC-html40/charset.html">
> <CITE>Character encoding</CITE></A></TH>
> <TH>in <A
> HREF="http://www.stonehand.com/unicode/standard/wg2n1036.html">
> UTF-8</A></TH>
> <TD>E7 A2 81</TD>
> <TD>碁</TD></TR>
> <TR><TH ROWSPAN=3><A
> HREF="http://www.w3.org/TR/REC-html40/charset.html#h-5.3.1">
> <CITE>Numeric character reference</CITE></A></TH>
> <TH>decimal</TH>
> <TD>&amp;#30849;</TD>
> <TD>&#30849;</TD></TR>
> <TR><TH ROWSPAN=2>hexadecimal</TH>
> <TD>&amp;#x7881;</TD>
> <TD>&#x7881;</TD></TR>
> <TR><TD>&amp;#X7881;</TD>
> <TD>&#X7881;</TD></TR>
> </TABLE>
> <P>What does your browser display for each?
> Note also the document title in your browser window's title bar,
> or in the "Document Info" window.
> </P>
> </HTML>
> Beware: the last cell in the first row of the table contains three
> non-ASCII bytes, which will be MIME quoted-printable encoded in this
> letter. If this cell does not display properly on your system, it may
> have been distorted by the mail-transfer process. In this case, insert
> three characters with the hex values given in the 3rd column of the
> same row.
>
> The title, the header, and the last column of the table should contain
> the same han/kanji character.
> The following table (please use a mono-pitch font to display it)
> summarizes how these browsers display the various encodings of a
> han/kanji character in various contexts. The encodings are denoted by
> the abbreviations
>   UTF for UTF-8,
>   dec for a decimal numeric character reference (NCR),
>   hex for a hexadecimal, lower-case NCR,
>   HEX for a hexadecimal, upper-case NCR.
>
> Browser      | enc | text window | title bar  | menus      | source
> -------------+-----+-------------+------------+------------+------------
> Alis         | UTF | ok          | ok         | ok         | ok
> Tango        | dec | ok          | not tested | not tested | N/A
>              | hex | ok          | not tested | not tested | N/A
>              | HEX | missing     | not tested | not tested | N/A
> -------------+-----+-------------+------------+------------+------------
> Microsoft    | UTF | ok          | N/A        | N/A        | wrong
> Internet     | dec | ok          | N/A        | not tested | N/A
> Explorer     | hex | wrong       | N/A        | not tested | N/A
>              | HEX | wrong       | N/A        | not tested | N/A
> -------------+-----+-------------+------------+------------+------------
> Netscape     | UTF | ok          | repl.char. | repl.char. | repl.char.
> Communicator | dec | ok          | not tested | not tested | N/A
>              | hex | repl.char.  | not tested | not tested | N/A
>              | HEX | repl.char.  | not tested | not tested | N/A
>
> where
>   wrong:      encoding not recognized,
>               byte values interpreted as Latin-1 characters
>   N/A:        not applicable
>   repl.char.: encoding recognized but character not available;
>               a question mark or an open box is displayed instead
>   missing:    blank space is displayed rather than the character
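
To make the "wrong" case concrete, here is a small Python sketch of
what happens when the three UTF-8 bytes from the test page are read as
Latin-1 instead. It assumes the browser applies plain ISO 8859-1; a
Windows code page could map the byte 0x81 differently.

```python
# The three bytes given in the first row of the test table.
raw = bytes([0xE7, 0xA2, 0x81])

as_utf8 = raw.decode("utf-8")      # the intended single han/kanji character
as_latin1 = raw.decode("latin-1")  # one character per byte instead

print(len(as_utf8), [f"U+{ord(c):04X}" for c in as_utf8])
print(len(as_latin1), [f"U+{ord(c):04X}" for c in as_latin1])
```

Read as Latin-1, the bytes become three separate characters (U+00E7,
U+00A2, and the control character U+0081) rather than the single
character U+7881.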

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:40 EDT