Re: How to distinguish UTF-8 from Latin-* ?

From: Doug Ewell
Date: Wed Jun 21 2000 - 12:07:41 EDT

Timothy Partridge wrote:

> The following bit pattern is not Latin-* (but could be a control code):
> 100xxxxx

This is true, and it works fine if you are trying to detect true Latin-1
(ISO 8859-1). However, people sometimes say "Latin-1" when they mean
"Windows-1252", and then the test no longer holds, because Windows-1252
assigns printable characters to most of that range.

In Windows-1252, only 0x81, 0x8D, 0x8F, 0x90, and 0x9D are undefined.
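As a rough illustration of the difference, here is a minimal Python sketch
of both checks. The function and constant names are made up for this
example; only the bit pattern and the five byte values come from the
discussion above. Note that neither check proves what the encoding is, it
can only rule one out.

```python
# Hypothetical helper names; the byte values are the ones listed above.
CP1252_UNDEFINED = frozenset({0x81, 0x8D, 0x8F, 0x90, 0x9D})


def is_c1_byte(b: int) -> bool:
    """True if b matches the bit pattern 100xxxxx, i.e. 0x80..0x9F."""
    return (b & 0xE0) == 0x80


def rules_out_latin1(data: bytes) -> bool:
    """True if any byte is in 0x80..0x9F.

    In true Latin-1 (ISO 8859-1) that range holds only control codes,
    so ordinary text should not contain it.
    """
    return any(is_c1_byte(b) for b in data)


def rules_out_cp1252(data: bytes) -> bool:
    """True if any byte is one of the five values undefined in Windows-1252."""
    return any(b in CP1252_UNDEFINED for b in data)


# 0x93 and 0x94 are the curly double quotes in Windows-1252.
sample = b"\x93quoted\x94 text"
print(rules_out_latin1(sample))   # flagged by the 100xxxxx test
print(rules_out_cp1252(sample))   # but perfectly legal Windows-1252
```

So a file full of Windows "smart quotes" fails the strict Latin-1 test even
though it is not UTF-8 either, which is why the stricter five-byte check is
the one to use when "Latin-1" really means Windows-1252.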

-Doug Ewell
 Fullerton, California
