Re: Encoding designation in Java Script sites

From: Addison Phillips [GSC] (addison@globalsight.com)
Date: Tue Apr 11 2000 - 14:55:28 EDT


Hi Suzanne,

JavaScript-based sites are essentially HTML sites, so the encoding is either
in the HTTP header or in a META tag (or nonexistent). A META tag is
problematic because it can force the browser's parser to re-read the page
from the beginning, scrapping any data that was previously interpreted and
restarting the JavaScript parser (if it has already started processing a
script). Nonexistent is bad for the obvious reasons. The best place for
the encoding is in the HTTP header (where it is essentially invisible to the
end-user: you won't see it when viewing the page source).
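
To make that lookup order concrete, here is a rough Python sketch: take the
charset from the HTTP Content-Type header if it is there, and only fall back
to scanning for a META tag if it is not. The function names and the 1024-byte
scan window are my own assumptions for illustration; real browsers follow
more elaborate rules.

```python
import re
from email.message import Message

# Hypothetical helper: matches charset=... inside a <meta ...> tag.
_META_RE = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lower-cases the value and returns None if absent.
    return msg.get_content_charset()


def charset_from_meta(html_bytes):
    """Return the charset named in a META tag near the top of the page, or None."""
    match = _META_RE.search(html_bytes[:1024])  # assumed scan window
    return match.group(1).decode("ascii").lower() if match else None


def page_charset(content_type, body):
    """Prefer the HTTP header; fall back to the META tag; else None."""
    return charset_from_content_type(content_type) or charset_from_meta(body)
```

Calling page_charset("text/html; charset=Big5", body) never looks at the
body at all, which is the point: the header settles the question before any
parsing starts.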

XML uses a declaration at the very start of the file to indicate how to
decode it:

<?xml version="1.0" encoding="Big5"?>

Note that XML is natively Unicode by definition [although most XML books are
amusingly silent about what that means: my copy of The XML Handbook, for
example, says that XML is in Unicode and that there is an encoding called
UTF-8 which is compatible with ASCII... but frustratingly, it doesn't say
what "XML is in Unicode" *means* in terms of actual disk file encoding or
internal parsing. It turns out that most parsers use UCS-4 or UTF-16
internally and smart implementers use UTF-8 when storing the actual XML
files on disk. Strictly speaking you don't have to declare the encoding for
UTF-8 or UTF-16 (a parser must assume one of those when nothing is
declared), but declaring it does no harm. Byte Order Marks (U+FEFF, which
appears on disk as the bytes FE FF or FF FE) are the order of the day for
UTF-16 files].

The encoding declaration tells the parser how to decode the bytes of the
disk file into its internal "Unicode" [I'm grossly oversimplifying here, of
course]. The XML experts on this list can probably describe this process
much more succinctly than I can...
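
For what it's worth, the decoding step can be sketched in a few lines of
Python: look for a Byte Order Mark first, then for an encoding name in the
XML declaration, and otherwise assume UTF-8. The function names are
hypothetical, and this is a deliberate simplification (it ignores UTF-32
and EBCDIC, and it assumes the declaration itself is readable as ASCII).

```python
import codecs
import re

# Matches the encoding pseudo-attribute of an XML declaration at byte 0.
_DECL_RE = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def sniff_xml_encoding(data):
    """Guess the codec name to use for a byte string holding an XML file."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # this codec strips the UTF-8 BOM when decoding
    if data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf-16"  # this codec reads the BOM to pick the byte order
    match = _DECL_RE.match(data)
    if match:
        return match.group(1).decode("ascii")
    return "utf-8"  # nothing declared and no BOM: assume UTF-8


def decode_xml(data):
    """Turn the on-disk bytes into an in-memory Unicode string."""
    return data.decode(sniff_xml_encoding(data))
```

So a file that begins with the Big5 declaration shown earlier would be
decoded with the Big5 codec, while a bare file with no BOM and no
declaration would be treated as UTF-8.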

thanks,

Addison

Addison P. Phillips
Senior Globalization Consultant
Global Sight Corporation
mailto:addison@globalsight.com
================================
(+1) 408.350.3600 - Telephone
http://www.globalsight.com
================================
Going global with your web site? Global Sight provides Web-based software
solutions that simplify the process, cut costs, and save time.
----- Original Message -----
From: Suzanne Topping <stopping@rochester.rr.com>
To: Unicode List <unicode@unicode.org>
Sent: Tuesday, April 11, 2000 7:50 AM
Subject: Encoding designation in Java Script sites

> Hello,
>
> Can someone tell me where the encoding method is indicated in Java
> script-based web sites? I was just looking through the source of a few
> sites, and couldn't find any char-set designations.
>
> How about XML sites?
>
> Thanks!
>
> --++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Suzanne Topping
> Localization Unlimited
> (Globalization Process Improvement Consulting and Training)
>
> In association with BizWonk (TM)
>
> Phone: 716-473-0791
> Fax: 716-231-2013
> Email: stopping@rochester.rr.com
>
> (Send me an email to join the North East Localization Special Interest
> Group, an email distribution list which acts as a discussion forum for
> localization issues.)
>



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:01 EDT