From: Doug Ewell (dewell@adelphia.net)
Date: Mon Jun 06 2005 - 09:08:15 CDT
Antoine Leca <Antoine10646 at leca dash marti dot org> wrote:
>> It's a contrived example, but the string "NESTLÉ™" encoded in Latin-1
>
> It is a minor nit, but ™ (U+2122) does not appear in my Latin-1 (ISO/
> IEC 8859-1:1998) charts; of course, this character appears at position
> 9/9 in the Windows 1250, 1252, 1254, 1257, 1258 codepages (and also in
> some others, but those do not have É at 12/9).
Arrggh... you are right, of course, and I am guilty of the same mistake
I've seen many times before.
The string I gave as an example was meant to be encoded in Windows code
page 1252, not in ISO 8859-1 as I said. In 8859-1, byte 0x99 corresponds
to U+0099, a C1 control character that doesn't even have a name.
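For anyone who wants to check the bytes, here is a minimal Python sketch of the
difference. It is an illustration only, and it assumes Python's built-in
"cp1252", "latin-1" and "utf-8" codecs behave as documented; the variable names
are made up for this example.

```python
import unicodedata

# "NESTLE(TM)" with E-acute and the trademark sign, written with escapes
# so the source file stays ASCII.
text = "NESTL\u00c9\u2122"

# Windows-1252 has the trademark sign at 0x99, so this encodes to ... C9 99.
cp1252_bytes = text.encode("cp1252")

# ISO 8859-1 has no trademark sign, so the same string cannot be encoded.
try:
    text.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# The bytes C9 99 happen to form one well-formed two-byte UTF-8 sequence.
as_utf8 = cp1252_bytes.decode("utf-8")

# Read as ISO 8859-1, byte 0x99 becomes the C1 control U+0099,
# for which the Unicode character database has no name.
as_latin1 = cp1252_bytes.decode("latin-1")
c1_name = unicodedata.name(as_latin1[-1], "<no name>")

print(" ".join(f"{b:02X}" for b in cp1252_bytes))
print(f"UTF-8 reading ends in U+{ord(as_utf8[-1]):04X}")
print(f"Latin-1 reading ends in U+{ord(as_latin1[-1]):04X} ({c1_name})")
```

The UTF-8 reading turns the last two bytes into a single character, U+0259,
which is why the string is ambiguous at the byte level.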
It is still possible to come up with a realistic example of text that is
both valid UTF-8 and plausible Latin-1, and I need to find one -- not
only because my current example is Windows-specific, but also because
Nestlé is not even a trademark (™) but a registered trademark (®).
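One way to hunt for such an example is to test candidate strings mechanically.
The sketch below is only a suggestion: the helper name reads_as_utf8 and the
candidate strings are hypothetical, chosen for illustration, and it assumes
Python's built-in "latin-1" and "utf-8" codecs.

```python
def reads_as_utf8(text: str) -> bool:
    """Return True if text, encoded as ISO 8859-1, is also well-formed UTF-8.

    Pure ASCII text passes trivially, so an interesting candidate
    must contain at least one character above U+007F.
    """
    raw = text.encode("latin-1")  # raises UnicodeEncodeError if not Latin-1
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical candidates, written with escapes to keep the source ASCII.
candidates = {
    "upper": "NESTL\u00c9\u00ae",  # E-acute (0xC9) + registered sign (0xAE)
    "lower": "Nestl\u00e9\u00ae",  # e-acute (0xE9) + registered sign (0xAE)
}

results = {key: reads_as_utf8(value) for key, value in candidates.items()}
print(results)
```

The upper-case candidate passes because 0xC9 is a two-byte UTF-8 lead byte and
0xAE is a valid trail byte. The lower-case one fails because 0xE9 starts a
three-byte sequence and only one trail byte follows.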
-- Doug Ewell Fullerton, California http://users.adelphia.net/~dewell/