Re: Correct definition for an "isLatin1()" function

From: David Starner (dvdeug@x8b4e516e.dhcp.okstate.edu)
Date: Thu Oct 05 2000 - 13:10:31 EDT


On Thu, Oct 05, 2000 at 08:35:31AM -0800, Rogers, Paul wrote:
> Hi, all,
>
> We're whipping up a little function named isLatin1() that returns true if
> the (UCS-2) string in question is "all Latin1".
>
> I'm curious as to how Unicode experts would define the correct range of code
> values that would indicate the string is Latin1.
>
> Perhaps it's obvious and super trivial, but I'm just looking for validation
> that our isLatin1() function would return true if each code value is in any
> of the following ranges:
>
> * ASCII + Basic Latin, U+0020 - U+007E (Or should we include U+007F,
> Delete?)
> * Latin-1 Supplement, U+00A0 - U+00FF
>
> In other words, should we exclude the C0, C1, and Latin Extended code
> values?

I don't know, should you? If the goal is to map the string to ISO 8859-1,
then the per-character test looks like
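        function isLatin1 (C : Wide_Character) return Boolean is
        begin
                -- A Wide_Character cannot be compared with an integer
                -- literal directly, so compare its code position.
                return Wide_Character'Pos (C) < 16#100#;
        end isLatin1;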
That is, include C0 and C1, since they are a part of ISO 8859-1. But
without knowing what you want to do with it, I can't say that that's
the right answer.
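
To make the two readings concrete, here is a rough sketch of both checks
applied to a whole string. The names Latin1_Demo, Is_Latin1 and
Is_Printable_Latin1 are mine, invented for illustration, and I'm assuming
a Wide_String stands in for your UCS-2 string. The first function accepts
everything in ISO 8859-1, including C0 and C1; the second accepts only the
ranges listed in the question (U+0020 - U+007E and U+00A0 - U+00FF).

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Latin1_Demo is

   --  True if every element of S has a code position below 16#100#,
   --  i.e. lies within ISO 8859-1, C0 and C1 controls included.
   function Is_Latin1 (S : Wide_String) return Boolean is
   begin
      for I in S'Range loop
         if Wide_Character'Pos (S (I)) >= 16#100# then
            return False;
         end if;
      end loop;
      return True;
   end Is_Latin1;

   --  Stricter variant using the ranges from the question:
   --  U+0020 .. U+007E and U+00A0 .. U+00FF only (no C0, DEL, or C1).
   function Is_Printable_Latin1 (S : Wide_String) return Boolean is
      P : Natural;
   begin
      for I in S'Range loop
         P := Wide_Character'Pos (S (I));
         if not (P in 16#20# .. 16#7E# or P in 16#A0# .. 16#FF#) then
            return False;
         end if;
      end loop;
      return True;
   end Is_Printable_Latin1;

   --  Sample characters: a C0 control and a character outside Latin-1.
   Tab   : constant Wide_Character := Wide_Character'Val (16#09#);
   Omega : constant Wide_Character := Wide_Character'Val (16#03A9#);

begin
   Put_Line (Boolean'Image (Is_Latin1 ("abc" & Tab)));            --  TRUE
   Put_Line (Boolean'Image (Is_Printable_Latin1 ("abc" & Tab)));  --  FALSE
   Put_Line (Boolean'Image (Is_Latin1 ("abc" & Omega)));          --  FALSE
end Latin1_Demo;
```

A string containing a tab passes the first check and fails the second, so
which one you want depends on whether control characters are acceptable in
whatever consumes the result.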

-- 
David Starner - dstarner98@aasaa.ofe.org
http/ftp: dvdeug.dhis.org
And crawling, on the planet's face, some insects called the human race.
Lost in space, lost in time, and meaning.
	-- RHPS


