Re: OT: Correct definition for an "isLatin1()" function

From: Frank da Cruz (fdc@columbia.edu)
Date: Thu Oct 05 2000 - 15:57:07 EDT


Michael Kaplan <RANT>ed:
> The assumption here is that the function will be run on Unicode text.
> Therefore, the various industrial and other code pages are irrelevant.
> Microsoft does not convert the characters it has in the control code range
> to those same code points in Unicode, does it? Indeed, a MultiByteToWideChar
> call on these code points using cp1252 does not leave them as control codes.
>
> No need to let this degenerate into a "Why Microsoft (and its code pages)
> suck" discussion, truly. However, there are several newsgroups:
>
But we don't know where the Unicode data came from. It might very well
have come from CP1252 converted as if it were Latin-1, which leaves C1
control codes (U+0080 to U+009F) where CP1252 graphic characters were
intended. Or, for that matter, Latin-1 converted as if it were CP1252.
Regular readers of this list know that this happens all the time, but
those asking basic questions might not. Therefore this can be helpful
information, as is, I trust, the part about normalization.
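
To make the first case concrete, here is a minimal Python sketch, not a
definitive test. The function names are invented for this illustration,
and it assumes that C1 control codes (U+0080 to U+009F) in otherwise
ordinary text are a symptom of CP1252 bytes having been decoded as
Latin-1. Text that legitimately contains C1 controls would be flagged
too, so treat the result as a hint only.

```python
def has_c1_controls(text: str) -> bool:
    """Return True if text contains any C1 control code (U+0080..U+009F)."""
    return any(0x80 <= ord(ch) <= 0x9F for ch in text)


def reinterpret_c1_as_cp1252(text: str) -> str:
    """Map each C1 control code to the character CP1252 assigns to that byte.

    Assumption: the C1 codes came from CP1252 bytes decoded as Latin-1.
    Bytes that Python's cp1252 codec leaves undefined (such as 0x81) are
    kept unchanged, as is every character outside the C1 range.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if 0x80 <= code <= 0x9F:
            try:
                ch = bytes([code]).decode("cp1252")
            except UnicodeDecodeError:
                pass  # undefined in CP1252: leave the control code as-is
        out.append(ch)
    return "".join(out)


# CP1252 curly quotes (bytes 0x93 and 0x94) wrongly decoded as Latin-1:
damaged = b"\x93quoted\x94".decode("latin-1")
repaired = reinterpret_c1_as_cp1252(damaged)
```

Here has_c1_controls(damaged) is true, and repaired holds the same word
between U+201C and U+201D. The reverse mix-up (Latin-1 decoded as CP1252)
cannot be detected this way, since the two encodings differ only in the
byte range 0x80 to 0x9F.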

- Frank


