Re: UN/LOCODE perspective on character sets from Mark Davis ☕️ on 2015-12-18 (Unicode Mail List Archive)

From: Mark Davis ☕️ <mark_at_macchiato.com>
Date: Fri, 18 Dec 2015 08:26:44 +0100

Haven't looked it over in detail, but here is the notice:

http://www.unece.org/fileadmin/DAM/cefact/locode/2015-2_UNLOCODE_SecretariatNotes.pdf

From a quick scan: They've added latitude/longitude (to the nearest minute,
roughly 2 km); that's great because location names are often ambiguous.
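
To put a number on the "~2km" figure, here is a rough sketch of how much ground one arc minute covers. It assumes a spherical Earth with a mean radius of 6371 km; the function name and the constant are illustrative and do not come from the UN/LOCODE documents.

```python
import math

# Assumption: spherical Earth with the commonly quoted mean radius.
EARTH_RADIUS_KM = 6371.0


def arc_minute_km(latitude_deg: float) -> tuple:
    """Return (north-south, east-west) ground distance in km of one arc minute."""
    one_minute_rad = math.radians(1 / 60)
    # One minute of latitude is the same length everywhere on a sphere.
    north_south = EARTH_RADIUS_KM * one_minute_rad
    # One minute of longitude shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


if __name__ == "__main__":
    for lat in (0, 45, 60):
        ns, ew = arc_minute_km(lat)
        print(f"lat {lat:2d}: {ns:.2f} km north-south, {ew:.2f} km east-west")
```

Under these assumptions a minute of latitude comes out a little under 2 km, and a minute of longitude is shorter the further you get from the equator.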

They still have deviations from the IATA codes, and various strange
omissions. And (as you note) they don't include the native name unless it
can be spelled with a *subset* of Latin-1 characters (ugh). They sometimes
list the ISO subdivision code, but there are no consistent inclusion
relations for other codes (e.g., they do have that San Francisco is in
California, but they miss many similar relations in other countries). And
the latitude/longitude is often missing.
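
As a rough illustration of the native-name restriction, the sketch below checks whether a name can be encoded in ISO 8859-1 at all. This is only an upper bound: the actual UN/LOCODE repertoire is a subset of Latin-1 that I have not verified, and the helper name `fits_latin1` is made up for this example.

```python
import unicodedata


def fits_latin1(name: str) -> bool:
    """Return True if every character of the name exists in ISO 8859-1.

    Assumption: full Latin-1 is used as a stand-in for the (smaller)
    repertoire that UN/LOCODE actually accepts.
    """
    # Compose base letter + combining mark first, so that "u" followed by
    # U+0308 is tested as the single character U+00FC.
    composed = unicodedata.normalize("NFC", name)
    try:
        composed.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for place in ("Zürich", "São Paulo", "Łódź", "İstanbul"):
        print(place, fits_latin1(place))
```

Names such as Łódź or İstanbul fail even this generous test, so their native spellings cannot appear in the list under the current rule.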

More at http://www.unece.org/cefact/locode/welcome.html

Mark

On Thu, Dec 17, 2015 at 10:19 PM, Doug Ewell <doug_at_ewellic.org> wrote:

> UN/LOCODE version 2015-2 has been released [1], and the Manual still
> contains the following about character sets:
>
> "27. Place names in UN/LOCODE are given in their national language
> versions as expressed in the Roman alphabet using the 26 characters of
> the character set adopted for international trade data interchange, with
> diacritic signs, when practicable (cf. Paragraph 3.2.2 of the UN/LOCODE
> Manual). International ISO Standard character sets are laid down in ISO
> 8859-1 (1987) and ISO10646-1 (1993). (The standard United States
> character set (437), which conforms to these ISO standards, is also
> widely used in trade data interchange)."
>
> Spot the errors.
>
> [1] http://www.unece.org/cefact/codesfortrade/codes_index.html
>
> --
> Doug Ewell | http://ewellic.org | Thornton, CO 🇺🇸
>
Received on Fri Dec 18 2015 - 01:26:44 CST
