Article in Financial Times; Feb 7, 2001

From: J M Sykes (
Date: Thu Feb 08 2001 - 07:17:27 EST

I got a reference to the following from ACM TechNews - Wednesday, February
7, 2001, so some may have seen it already.

It shows a degree of ignorance that I would hardly have believed possible in
a reputable newspaper. I know there are vocal members of this list who are
more knowledgeable on the subject than I, and I invite them to offer

The email address for letters to the FT is

If I don't hear that someone has accepted this invitation, I shall do my own
humble best.

As a taster, I append a few quotes:


Until recently, even the accents commonly used above and below letters in
German, French and Spanish could cause problems. This was because the
original system for representing letters on computers - known as Ascii
coding - set in stone only half of the 256 codes available to identify
different characters. (The basic unit of computing data used to do this is
the byte, each of which has eight bits, which can be either 0 or 1. That
gives 256 alternatives.)

This system covered the standard upper and lower case alphabet, numbers,
punctuation and a few special symbols found on keyboards, such as currency
and percentage symbols.

The other 128 codes could be used arbitrarily. Printers used them to create
special effects. Software applications used them, among other things, to
represent accented characters. But different applications adopted different
standards. This is why you still occasionally see gobbledegook in e-mail
messages from other countries.

The International Standards Organisation (ISO) has now agreed to give
standard meanings to these remaining codes. The new standard is known as
'Latin-1' or 'extended Ascii' and includes accented characters.

Note the "now" and the "new".
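
To make the point concrete, here is a minimal Python sketch. It is not part of the FT article; the sample character and the choice of code page 437 as the "other" application's convention are assumptions made purely for illustration. ASCII proper defines only 128 codes, ISO 8859-1 (Latin-1) dates from 1987 and assigns accented letters to single bytes in the upper half, and the same byte read under a different convention comes out as a different character:

```python
# Illustrative sketch only: the sample character and code page are assumptions.
sample = "é"

# ASCII is a 7-bit code, so it has no slot for an accented letter.
try:
    sample.encode("ascii")
    ascii_fails = False
except UnicodeEncodeError:
    ascii_fails = True

# Latin-1 (ISO 8859-1) stores the same letter in one byte from the upper half.
latin1_byte = sample.encode("latin-1")

# Reading that byte under a different 8-bit convention (here code page 437)
# yields some other character: the "gobbledegook" effect.
misread = latin1_byte.decode("cp437")

print("ASCII can encode it:", not ascii_fails)
print("Latin-1 byte:", latin1_byte.hex())
print("Same byte read as cp437:", misread)
```

The mismatch arises only when sender and receiver disagree about which 8-bit set is in use, which is a labelling problem rather than the absence of a standard.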

Another coding system has been devised to cater for Asian languages such as
Chinese, Japanese and Korean. These languages have thousands of ideographic
characters, each representing a single word. A coding system called Unicode
has emerged as the standard. This uses twice as much data to represent each
character, and so is known as a 'double-byte' coding system.

Note the "Another".
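
A short Python sketch may help show why "another" and "double-byte" are both misleading. The sample characters and the helper name `encoded_lengths` are chosen here for illustration and do not come from the article. Unicode covers the Latin-1 repertoire at the same code values rather than sitting alongside it, and the number of bytes per character depends on the encoding form used:

```python
# Illustrative sketch only: sample characters and helper name are assumptions.

def encoded_lengths(ch):
    """Return (bytes in UTF-8, bytes in UTF-16) for a single character."""
    return (len(ch.encode("utf-8")), len(ch.encode("utf-16-be")))

# The first 256 Unicode code points coincide with Latin-1.
assert ord("é") == "é".encode("latin-1")[0]

samples = ["A", "é", "中", "\U00020000"]
for ch in samples:
    utf8_len, utf16_len = encoded_lengths(ch)
    print(f"U+{ord(ch):04X}: UTF-8 {utf8_len} byte(s), UTF-16 {utf16_len} byte(s)")
```

In this sketch a plain "A" takes one byte in UTF-8, while a character outside the Basic Multilingual Plane takes four bytes in both UTF-8 and UTF-16, so no single fixed width describes Unicode text.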

There will be improvements as the Unicode standard becomes adopted in the
next version of HTML - the computer language that underpins information
display on the web.


More information is available via the website:

Copyright: The Financial Times Limited

Even allowing that journalists can't be expected to be experts on
everything, and some standards take a long time to be widely adopted (and
some never are!), the extracts above seem to me to give a rather distorted



J M Sykes Email:
97 Oakdale Drive
Heald Green
Cheshire SK8 3SN
UK Tel: (44) 161 437 5413

