RE: Character set cluelessness

From: Doug Ewell
Date: Tue, 02 Oct 2012 14:15:03 -0700

Mark Davis 🍕 <mark at macchiato dot com> wrote:

> I tend to agree. What would be useful is to have one column for the
> city in the local language (or more columns for multilingual cities),
> but it is extremely useful to have an ASCII version as well.

They have two name fields, one ("Name") for the name transliterated into
Latin, and a second ("NameWoDiacritics") which is an ASCII-smashed
version of the first. Again, that's fine as long as I am free to ignore
the ASCII version. They don't attempt to represent names in non-Latin
scripts, which is not my beef here.
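
For illustration, an "ASCII-smashed" field of this kind could be derived from the Latin-script name roughly as follows. This is only a sketch of one common approach (Unicode decomposition, then discarding whatever is not ASCII). The function name `smash_to_ascii` is hypothetical, the sample strings are illustrative, and nothing here reflects how the "NameWoDiacritics" field is actually produced.

```python
import unicodedata

def smash_to_ascii(name: str) -> str:
    """Hypothetical helper: strip diacritics and drop anything non-ASCII."""
    # NFKD splits precomposed letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    # errors="ignore" discards the combining marks, along with any character
    # that still has no ASCII form after decomposition.
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(smash_to_ascii("Žilina"))  # the caron is removed, the base letter stays
print(smash_to_ascii("œ"))       # œ has no decomposition, so it is dropped
```

Letters such as œ have no canonical or compatibility decomposition, so a real conversion would likely need an explicit mapping table (for example œ to "oe") on top of this.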

There are many names in the "Name" (i.e. "beyond ASCII") field that
include characters beyond 8859-1, such as œ and ž, and certainly many
beyond CP437. This is a good thing (there are some errors, though fewer
than in past years), but they need to fix their documentation to
reflect what they actually do, and not make these irrelevant,
misleading, and/or inaccurate references to 437 and to a 19-year-old
version of 10646.
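
The claim about repertoire is easy to check mechanically. The sketch below tests whether a few sample characters can be encoded in ISO 8859-1 and in CP437, using Python's standard codecs. The helper name `encodable` and the choice of sample characters are illustrative assumptions, not taken from the data file or its documentation.

```python
def encodable(text: str, codec: str) -> bool:
    """Return True if every character in text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# é is in both character sets; œ (U+0153) and ž (U+017E) are in neither.
for ch in ("é", "œ", "ž"):
    print(ch, encodable(ch, "iso-8859-1"), encodable(ch, "cp437"))
```

Running the same check over every value in the "Name" field would show which characters fall outside each character set.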

Doug Ewell | Thornton, Colorado, USA | @DougEwell
Received on Tue Oct 02 2012 - 16:15:52 CDT
