Re: Correct definition for an "isLatin1()" function

From: John Cowan (jcowan@reutershealth.com)
Date: Thu Oct 05 2000 - 13:50:25 EDT

Next message: Michael (michka) Kaplan: "Re: do all browsers support UTF-8 encoding???"
Previous message: Paul Deuter: "RE: charset list"
Maybe in reply to: Rogers, Paul: "Correct definition for an "isLatin1()" function"
Next in thread: Frank da Cruz: "Re: Correct definition for an "isLatin1()" function"

"Rogers, Paul" wrote:

> We're whipping up a little function named isLatin1() that returns true if
> the (UCS-2) string in question is "all Latin1".

[snip]

> In other words, should we exclude the C0, C1, and Latin Extended code
> values?

Including or excluding C0 and C1 is a matter of taste. If you mean
"strictly containing characters in ISO 8859-1", then they're out.
If you mean "representable in typical Latin-1 text files", then at least
C0 is in, and C1 will do no great harm. (Provided your Unicode
characters don't originate from incorrect transcoding from CP 1252.)

The Latin Extended blocks are definitely out.
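
To make the two readings above concrete, here is a minimal sketch in Python (not the original poster's code). The name is_latin1 and the flags allow_c0 and allow_c1 are hypothetical, and grouping DEL (U+007F) with the C0 controls is an assumption of this sketch, not something the thread settles. The strict reading accepts only the ISO 8859-1 graphic characters, U+0020..U+007E and U+00A0..U+00FF; the looser reading also lets the control ranges through.

```python
def is_latin1(text: str, allow_c0: bool = True, allow_c1: bool = True) -> bool:
    """Return True if every character of text falls in the Latin-1 range.

    allow_c0 / allow_c1 are hypothetical switches for the "matter of taste"
    question: whether C0 (U+0000..U+001F) and C1 (U+0080..U+009F) controls
    count as Latin-1. DEL (U+007F) is treated like C0 here by assumption.
    """
    for ch in text:
        cp = ord(ch)
        # Graphic characters defined by ISO 8859-1 are always accepted.
        if 0x20 <= cp <= 0x7E or 0xA0 <= cp <= 0xFF:
            continue
        # C0 controls (plus DEL, by assumption): accepted only if allowed.
        if cp <= 0x1F or cp == 0x7F:
            if allow_c0:
                continue
            return False
        # C1 controls: accepted only if allowed.
        if 0x80 <= cp <= 0x9F:
            if allow_c1:
                continue
            return False
        # Anything above U+00FF, including the Latin Extended blocks, is out.
        return False
    return True
```

With both flags set to False this is the "strictly ISO 8859-1" test; with the defaults it is the "representable in a typical Latin-1 text file" test. If the input may have come from CP 1252 bytes decoded as Latin-1, characters in U+0080..U+009F are a hint of that mistake, so rejecting C1 (allow_c1=False) may be the safer choice in that situation.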

-- 
There is / one art                   || John Cowan <jcowan@reutershealth.com>
no more / no less                    || http://www.reutershealth.com
to do / all things                   || http://www.ccil.org/~cowan
with art- / lessness                 \\ -- Piet Hein


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:14 EDT