RE: Identifying file encoding scheme

From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Sep 13 1999 - 21:33:28 EDT


> if lpBuffer points to
> the ASCII string 0x41, 0x0A, 0x0D, 0x1D (A\n\r^Z), the string passes the
> IS_TEXT_UNICODE_STATISTICS test, though failure would be preferable.

Picking nits, I presume you mean:

0x41, 0x0D, 0x0A, 0x1A (A\r\n^Z)

(Starting with Unicode 3.0, U+410D U+0A1A actually is a valid sequence of
two assigned Unicode characters: U+410D is a *very* rare alternate for
the "common" form U+8721, referring to a year-end festival of the Zhou
dynasty. U+0A1A is the GURMUKHI LETTER CA. Not exactly a likely combination
in real text, I warrant.)
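
To check that reading, here is a minimal Python sketch. It assumes the four
bytes are paired in big-endian order, which is the pairing that yields
U+410D U+0A1A; a little-endian pairing is shown for comparison. The variable
names are illustrative only, and the character names printed depend on the
Unicode data shipped with the Python in use.

```python
import unicodedata

# The four corrected bytes from the message: 'A', CR, LF, Ctrl-Z.
data = bytes([0x41, 0x0D, 0x0A, 0x1A])

# Read as 8-bit ASCII: four characters.
as_ascii = data.decode("ascii")

# Read as big-endian UTF-16 (assumed byte order): two 16-bit code units.
as_utf16_be = data.decode("utf-16-be")
code_points = [f"U+{ord(ch):04X}" for ch in as_utf16_be]

# Read as little-endian UTF-16, for comparison.
as_utf16_le = data.decode("utf-16-le")
code_points_le = [f"U+{ord(ch):04X}" for ch in as_utf16_le]

print(repr(as_ascii))
print(code_points, code_points_le)

# Look up character names; fall back to a placeholder if none is known.
for ch in as_utf16_be:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<no name available>"))
```

The same four bytes are plausible ASCII and also decode without error as
UTF-16, which is why a purely statistical test has so little to go on with
a buffer this short.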

--Ken


