Re: FW: Japanese characters -- HTML for both MAC and Windows...

From: Otto Stolz (Otto.Stolz@uni-konstanz.de)
Date: Tue Jun 23 1998 - 15:24:27 EDT


On 19 June 1998, Cristina Mateo wrote:
> I need to write Japanese characters.
> How do I do that in HTML for both Mac and Windows?

HTML 4.0 <http://www.w3.org/TR/REC-html40/> has adopted the UCS
(ISO/IEC 10646-1:1993 + TAs, or Unicode 2.0 and above) as the document
character set for HTML. This means that you can, in principle, use any
UCS character in any WWW page that includes an HTML 4.0 version declaration,
see <http://www.w3.org/TR/REC-html40/struct/global.html#h-7.2>.

Of course, which characters the reader of your page will actually see
depends on the browser and the fonts he or she has installed. Many modern
browsers can display Japanese characters; I have tried Netscape Navigator
4.0 <http://www.netscape.com/download/>, Internet Explorer 4.0
<http://www.microsoft.com/products/prodref/651_ov.htm>, and Tango 3.1.1
<http://www.alis.com/internet_products/index.en.html>. Netscape Navigator
(for Unix and for MS-Windows) and Tango come with their own fonts, which
include the Japanese syllabic and ideographic characters; I cannot say
anything about browsers available for the Mac, though. Internet Explorer
uses the fonts installed under the Windows 95 or NT system; your readers
may wish to try the Bitstream Cyberbit (proportional) and Everson Mono
10646 (monospace) fonts, see
<http://www.bitstream.com/products/world/cyberbits/ftpcyber.html>
and <ftp://dkuug.dk/CEN/TC304/EversonMono10646>, respectively. Under
Windows 95, a non-Latin alternate keyboard must be installed before
non-Latin fonts can be installed and viewed.

As an HTML author, you can
- either encode your whole page in UTF-8,
  see <http://www.stonehand.com/unicode/standard/wg2n1036.html>,
  so it may contain any UCS character,
- or use a more limited character encoding
  (such as the omnipresent ISO 8859-1 Latin-1),
  and denote additional characters
  (such as Japanese ones in the Latin-1 case)
  by numerical character references,
  see <http://www.w3.org/TR/REC-html40/charset.html#h-5.3>.
You can study examples of these techniques
at <http://www.reuters.com/unicode/iuc10/x-utf8.html>
and <http://www.reuters.com/unicode/iuc10/x-ncr.html>, respectively.
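
To make the two options more concrete, here is a minimal Python sketch
(my own illustration, not taken from the pages cited above). It encodes
a short Japanese string once as UTF-8 and once as Latin-1 with numerical
character references. The helper name to_ncr and the sample string are
assumptions chosen for the example.

```python
# Sketch only: two ways to put Japanese text into an HTML page.
# The sample string is written with \u escapes so that this source
# stays pure ASCII; it spells the word "nihongo" (Japanese language).
SAMPLE = "\u65e5\u672c\u8a9e"


def to_ncr(text, encoding="latin-1"):
    """Replace every character that `encoding` cannot represent
    with a decimal numerical character reference (&#NNNN;).

    to_ncr is a hypothetical helper name; the work is done by
    Python's built-in 'xmlcharrefreplace' error handler.
    """
    return text.encode(encoding, errors="xmlcharrefreplace").decode(encoding)


# Option 1: encode the whole page in UTF-8.
# Each of these three characters takes three bytes.
as_utf8 = SAMPLE.encode("utf-8")

# Option 2: keep the page in Latin-1 and denote the Japanese
# characters by numerical character references.
as_ncr = to_ncr(SAMPLE)

print(as_utf8)  # b'\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e'
print(as_ncr)   # &#26085;&#26412;&#35486;
```

Characters that Latin-1 can represent directly pass through to_ncr
unchanged, so only the additional characters become references.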

When composing HTML pages, you should also properly tag the text parts
with their respective languages, see
<http://www.w3.org/TR/REC-html40/struct/dirlang.html#h-8.1>.
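
As a rough illustration of such tagging, the following Python sketch
wraps a piece of text in a SPAN element that carries a lang attribute.
The helper name tag_language is hypothetical, and "ja" is assumed here
as the language code for Japanese.

```python
from html import escape


def tag_language(text, lang):
    """Wrap `text` in a SPAN element with a lang attribute.

    tag_language is a hypothetical helper for illustration; it
    escapes markup characters so the text cannot break the page.
    """
    return '<span lang="{}">{}</span>'.format(escape(lang, quote=True),
                                              escape(text))


# "ja" is assumed to be the language code for Japanese.
print(tag_language("\u65e5\u672c\u8a9e", "ja"))
```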

Best wishes,
   Otto Stolz


