From: Peter Kirk (email@example.com)
Date: Sun Apr 03 2005 - 16:25:13 CST
On 03/04/2005 21:29, Doug Ewell wrote:
>There was comparatively little urgency with regard to the speakers of
>German, Polish, Kobon, and Sencoten, who are already familiar with the
>Latin script but require letters that aren't available in non-IDN domain
>names. They had gotten along with Basic Latin approximations for years,
>and were largely expected to continue to do so. Domain names, after
>all, are not usually expected to be linguistically perfect.
Speakers of German and Polish are at least used to being forced to
mangle their languages to fit in with American ideas of what letters are
acceptable, by avoiding letters which they were using in their countries
at a time when no one was writing anything in America (Mayan writing having
died out before the Europeans arrived, I think). But don't assume that
the same is true of speakers of Kobon and Sencoten, who may be entirely
unused to their languages being mangled in this way, and whose languages
may actually be rendered unintelligible if certain distinctions are
lost. Actually this is true of less obscure languages as well: in
Azerbaijani öldü means "he/she/it died", but the mangled version of this
which might be acceptable for a URL, oldu, means "he/she/it became", in
other words potentially the exact opposite. So in many languages
diacritics are not optional.
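To make the collision concrete, here is a minimal sketch of the kind of mangling I mean. It assumes the fallback is "decompose and drop the combining marks"; the helper name strip_diacritics is my own, for illustration only, and is not taken from any IDN specification or registry rule.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Hypothetical lossy fallback: decompose to NFD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


died = "öldü"    # Azerbaijani: "he/she/it died"
became = "oldu"  # Azerbaijani: "he/she/it became"

# After mangling, the two distinct words can no longer be told apart.
print(strip_diacritics(died))            # oldu
print(strip_diacritics(died) == became)  # True
```

Any real registry might use a different fallback, but any scheme that maps ö to o and ü to u produces the same collision.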
>>Actually, does anyone want U+026B? This is not a click. Perhaps you
>>were thinking of U+01C2.
>Vlad had written, "L WITH MIDDLE TILDE is used orthographically in
>Kobon." I assumed he meant U+026B LATIN SMALL LETTER L WITH MIDDLE
>TILDE.
Thank you, I had missed that and thought you were referring to the
clicks which Vlad also mentioned.
>U+01C2 LATIN LETTER ALVEOLAR CLICK, on the other hand, doesn't look at
>all like an L with middle tilde.
No, but it does look like a small L with a double bar across it, at
least in sans-serif - and so like the double-barred L proposed in
http://std.dkuug.dk/JTC1/SC2/WG2/docs/n2847.pdf and accepted by the UTC
as provisional U+2C61.
Mark Davis wrote:
>(b) it is part of a bicameral script and doesn't have an uppercase, which is
>the situation for
>026B ; LATIN ; Atomic-no-uppercase # L& (ɫ) LATIN SMALL LETTER L
>WITH MIDDLE TILDE
Maybe this is the formal situation in the current version of Unicode,
but http://std.dkuug.dk/JTC1/SC2/WG2/docs/n2847.pdf also has evidence
that this letter does have an uppercase, although the evidence is only
for one little-used language, and this uppercase has been accepted by
the UTC as provisional U+2C62. So the only difference between this and
Polish is that the latter has more speakers.
Later, Mark wrote:
>But if 'k' really were not used in any real publications in a modern
>language, then it would be a different story (see my previous message).
Sorry to quote http://std.dkuug.dk/JTC1/SC2/WG2/docs/n2847.pdf yet
again, but this gives evidence of these letters being used in real
publications in a modern language.
-- 
Peter Kirk
firstname.lastname@example.org (personal)
email@example.com (work)
http://www.qaya.org/