RE: Unicode on a website

From: Doug Ewell (dewell@compuserve.com)
Date: Sun Sep 24 2000 - 11:04:38 EDT


"Carl W. Brown" <cbrown@xnetinc.com> wrote:

> scsu makes sense for large blocks of data. Send the frame work in
> utf-8 but use HTTP to request the bulk data in scsu. If it is a
> small amount of data you don't want to pay the overhead of the
> compression.

SCSU was intentionally designed to be extremely low in overhead. It
needs no header or dictionary, so even a short string can be encoded
without penalty. This is one of the main differences between SCSU and
most other compression schemes.

> You don't need a BOM with UTF-8.

Not for byte-ordering purposes, but it is often handy as a signature.
Auto-detection of UTF-8 is not difficult, but not foolproof either --
there are legitimate sequences of Latin-1 characters that look like
UTF-8. Using the signature EF BB BF at the beginning of a file is a
more reliable indication that the file is UTF-8.
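
To make the ambiguity concrete, here is a minimal Python sketch. The
function names has_utf8_signature and looks_like_utf8 are made up for
this illustration, and "decodes cleanly as UTF-8" is only one possible
auto-detection heuristic, not a standard algorithm.

```python
import codecs

# The UTF-8 signature (BOM) bytes: EF BB BF.
UTF8_SIGNATURE = codecs.BOM_UTF8


def has_utf8_signature(data: bytes) -> bool:
    """Return True if the data starts with the bytes EF BB BF."""
    return data.startswith(UTF8_SIGNATURE)


def looks_like_utf8(data: bytes) -> bool:
    """Heuristic check (an assumption of this sketch): the data
    decodes as UTF-8 without error."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The two Latin-1 characters U+00C3 U+00A9 are stored as bytes C3 A9,
# which is also the valid UTF-8 encoding of the single character U+00E9.
ambiguous = "\u00c3\u00a9".encode("latin-1")

print(looks_like_utf8(ambiguous))     # the heuristic accepts Latin-1 data
print(has_utf8_signature(ambiguous))  # no signature, so no false claim
```

The heuristic cannot tell which reading of C3 A9 was intended, whereas
a file that begins with EF BB BF has stated its encoding explicitly.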

-Doug Ewell
 Fullerton, California


