Re: Glaring mistake in the code list for South Asian Script

From: John H. Jenkins <jenkins_at_apple.com>
Date: Thu, 08 Sep 2011 12:25:46 -0600

The Latin script covers alphabets for languages other than Latin.

The Arabic script covers alphabets for languages other than Arabic.

CJK Ideographs aren't ideographs.

U+FE18 PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET isn't a brakcet.

And so on.

I realize it is frustrating when Unicode and related standards show apparent indifference to getting names absolutely right. The practical reality is that Unicode's intention is to use names which reflect standard or common English usage, not to be incontrovertibly correct. Experts (or native speakers) may well use or prefer different terminology. In some cases, such as Burmese, the terminology involved can be controversial, often for political reasons. Almost never is it true that *everybody* agrees on a name/term.

Moreover, for stability reasons, Unicode names can well be frozen. Even if everybody comes to agree that a given name is absolutely and completely wrong, we can get stuck with it. There was a time when Unicode was willing to change names, but that proved to be a very bad idea.
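As a small illustration of a frozen name, the sketch below asks Python's standard unicodedata module for the formal name of U+FE18, which still carries its misspelling. It assumes Python 3.3 or later, where lookup() also accepts the formal name aliases that Unicode publishes as corrections. The variable names are only illustrative.

```python
import unicodedata

# U+FE18: its formal name was published with a typo and cannot be changed.
ch = "\uFE18"
formal_name = unicodedata.name(ch)
print(formal_name)

# Unicode leaves the name alone and publishes a corrected alias instead.
# Python 3.3+ resolves such aliases in lookup(), though name() keeps
# returning the original, frozen name.
corrected_alias = "PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET"
same_char = unicodedata.lookup(corrected_alias)
print(same_char == ch)
```

Both spellings resolve to the same code point, so software that depends on the original name keeps working.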

The net result is that Unicode is loaded with misnomers. And yes, this is unfortunate and often very embarrassing—but so long as Unicode does its intended job and makes it possible for people to represent texts written in the various languages it covers, it's something we just have to live with.

See also <http://www.unicode.org/faq/basic_q.html#4>.

=====
Siôn ap-Rhisiart
John H. Jenkins
jenkins_at_apple.com
Received on Thu Sep 08 2011 - 13:29:29 CDT
