Re: There is no ANSI in Microsoft

From: Frank da Cruz (fdc@columbia.edu)
Date: Wed Mar 22 2000 - 10:05:57 EST


Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk> wrote:
> Chris Pratley wrote on 2000-03-22 04:00 UTC:
> > First, I want to clarify that the "native encoding" of Windows2000 is
> > Unicode. What you are referring to as "native encoding" is actually the
> > emulated encoding, usually called the ANSI code page of the system. In Hong
> > Kong, this should be "Big5" encoding.
>
> Seeing Microsoft employees (especially on a coded character set
> specialist mailing list!!!) referring to their "code pages" as something
> associated with ANSI makes me cringe on a regular basis. Big5 is not an
> ANSI standard. Neither is CP1252. In fact, the only ANSI standards that
> are remotely relevant to any of the Microsoft code pages are ANSI/ISO
> 8859 and ANSI/ISO 10646, both of which did not originate within ANSI but
> within ECMA and ISO and are far more widely and appropriately referred
> to as ISO standards. In this sense, Unicode is even more an "ANSI code
> page" than CP1252.
>
> In general: the word "ANSI" is completely and utterly inappropriate if
> mentioned in any context whatsoever with a Microsoft character encoding.
>
> You could do me a big favour by doing a full-text search on all
> Microsoft documentation and eliminate the word ANSI from all character
> set contexts. Or at least, check if you have some corporate style-guides
> or glossary and eliminate "ANSI" from there and add a note that
> describes "ANSI" as an inappropriate historic in-house terminology for
> (ANSI/)ISO 8859-1, which is not identical to any of the Microsoft code
> pages used today. Thanks!
>
> It might sound pedantic, but Microsoft's continuous abuse of the term
> ANSI in the context of their encodings is already finding its way into
> (bad) textbooks and other non-Microsoft products.
>
Seconded. The misuse of the term ANSI -- not only in this context -- is
enough to give anybody who has an appreciation of standards a stomach ache.
Sometimes I wonder how ANSI itself feels about it. If I were ANSI, I'd feel
the same way as Xerox or Kleenex about the application of their names to
other companies' products.

The same thing has been going on with terminal emulation for the past 20
years. There is (or was) indeed an ANSI standard -- X3.64 -- for embedding
terminal controls ("escape sequences") in the data stream. It was followed
by terminals such as the DEC VT220 and above, and is still used today in
(e.g.) the Linux, AIX, and SCO consoles. It is well-designed, consistent,
and extensible. It allows control sequences to be parsed out of the data
stream without prior knowledge of the repertoire, based purely upon their
structure.
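
To illustrate the point about parsing by structure alone, here is a minimal
sketch in Python. It assumes the control-sequence layout as it is also
published in ECMA-48 (introducer, parameter bytes, intermediate bytes, one
final byte). The names CONTROL_SEQUENCE and split_stream are made up for this
example, and it handles only CSI sequences, not the other escape-sequence
families.

```python
import re

# Structure of a control sequence (ANSI X3.64 / ECMA-48):
#   introducer          ESC [ (7-bit form) or the single byte 0x9B (8-bit form)
#   parameter bytes     0x30-0x3F, zero or more
#   intermediate bytes  0x20-0x2F, zero or more
#   final byte          0x40-0x7E, exactly one
CONTROL_SEQUENCE = re.compile(
    rb"(?:\x1b\[|\x9b)([\x30-\x3f]*)([\x20-\x2f]*)([\x40-\x7e])"
)


def split_stream(data: bytes):
    """Split data into ('text', bytes) and ('csi', params, intermediates, final)."""
    items = []
    pos = 0
    for m in CONTROL_SEQUENCE.finditer(data):
        if m.start() > pos:
            # Ordinary data between two control sequences.
            items.append(("text", data[pos:m.start()]))
        items.append(("csi", m.group(1), m.group(2), m.group(3)))
        pos = m.end()
    if pos < len(data):
        items.append(("text", data[pos:]))
    return items


if __name__ == "__main__":
    # The last sequence uses a final byte the parser has no table entry for;
    # it is still delimited correctly because only the byte ranges matter.
    for item in split_stream(b"ab\x1b[1;31mcd\x1b[99z"):
        print(item)
```

Note that the parser never consults a list of known sequences. It cannot say
what a sequence means, only where it begins and ends, which is exactly what a
terminal needs in order to skip a sequence it does not implement.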

Unfortunately, in common parlance "ANSI terminal emulation" has nothing to
do with ANSI X3.64. Instead (and here's the relevance) it refers to the
ANSI.SYS console driver, used on the original IBM PC, which indeed used a
couple of ANSI X3.64-format escape sequences, plus quite a few that did not
fit, and used a PC code page for its character set, the hallmarks of which
were the decidedly anti-ANSI use of the C1 area for graphic characters, a
plethora of line- and box-drawing characters, and a complete lack of
correspondence with any standard character set.
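
The C1 point is easy to check for yourself. The sketch below assumes code page
437 as a representative PC code page (the text above does not name one) and
uses Python's standard "cp437" and "latin-1" codecs; the variable names are
only illustrative.

```python
import unicodedata

# The 32 byte values that ISO 8859-style 8-bit sets reserve for C1 controls.
c1_bytes = bytes(range(0x80, 0xA0))

as_latin1 = c1_bytes.decode("latin-1")  # ISO 8859-1 interpretation
as_cp437 = c1_bytes.decode("cp437")     # PC code page interpretation

# True if every character is a control character (Unicode category Cc).
latin1_all_controls = all(unicodedata.category(ch) == "Cc" for ch in as_latin1)
# True if no character is a control character, i.e. all are graphic.
cp437_all_graphic = all(unicodedata.category(ch) != "Cc" for ch in as_cp437)

# A byte from the box-drawing range of the code page, read both ways.
box_cp437 = b"\xc9".decode("cp437")
box_latin1 = b"\xc9".decode("latin-1")

if __name__ == "__main__":
    print(latin1_all_controls, cp437_all_graphic)
    print(as_cp437)
    print(unicodedata.name(box_cp437), "/", unicodedata.name(box_latin1))
```

The same 32 byte values are controls under one reading and accented letters
and currency signs under the other, so a stream written for one will be
misread by a device expecting the other.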

ANSI is a standards organization. It produces standards for everything
from stepladders and screw threads to coded character sets. Thus calling
something "ANSI" without reference to *which* ANSI standard we're talking
about is a good sign we're not talking about *any* ANSI standard.

- Frank


