Re: Security Issues

From: Peter Kirk
Date: Sun Apr 03 2005 - 16:25:13 CST

    On 03/04/2005 21:29, Doug Ewell wrote:

    >There was comparatively little urgency with regard to the speakers of
    >German, Polish, Kobon, and Sencoten, who are already familiar with the
    >Latin script but require letters that aren't available in non-IDN domain
    >names. They had gotten along with Basic Latin approximations for years,
    >and were largely expected to continue to do so. Domain names, after
    >all, are not usually expected to be linguistically perfect.
    Speakers of German and Polish are at least used to being forced to
    mangle their languages to fit in with American ideas of what letters are
    acceptable, by avoiding letters which they were using in their countries
    at a time when no one was writing anything in America (Mayan writing having
    died out before the Europeans arrived, I think). But don't assume that
    the same is true of speakers of Kobon and Sencoten, who may be entirely
    unused to their languages being mangled in this way, and whose languages
    may actually be rendered unintelligible if certain distinctions are
    lost. Actually this is true of less obscure languages as well: in
    Azerbaijani öldü means "he/she/it died", but the mangled version of this
    which might be acceptable for a URL, oldu, means "he/she/it became", in
    other words potentially the exact opposite. So in many languages
    diacritics are not optional.
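
    A minimal sketch of how such mangling collapses the two words, assuming
    the "Basic Latin approximation" is made by decomposing each character and
    dropping the combining marks (the helper name strip_diacritics is
    hypothetical, and real registries may approximate differently):

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Return text with combining marks removed (hypothetical helper)."""
    # NFD splits a precomposed letter such as "ö" into "o" + U+0308.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


died = "öldü"    # "he/she/it died"
became = "oldu"  # "he/she/it became"

# After stripping, the two distinct words are the same string.
print(strip_diacritics(died) == became)
```

    Note that this approach only handles letters that decompose into a base
    letter plus a mark; a letter such as Polish ł has no such decomposition
    and would pass through unchanged.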

    > ...
    >>Actually, does anyone want U+026B? This is not a click. Perhaps you
    >>were thinking of U+01C2.
    >Vlad had written, "L WITH MIDDLE TILDE is used orthographically in
    >Kobon." I assumed he meant U+026B LATIN SMALL LETTER L WITH MIDDLE
    >TILDE.

    Thank you, I had missed that and thought you were referring to the
    clicks which Vlad also mentioned.

    >U+01C2 LATIN LETTER ALVEOLAR CLICK, on the other hand, doesn't look at
    >all like an L with middle tilde.
    No, but it does look like a small L with a double bar across it, at
    least in sans-serif, and so like the double-barred L that has been
    proposed and accepted by the UTC as provisional U+2C61.

    Mark Davis wrote:

    >(b) it is part of a bicameral script and doesn't have an uppercase, which is
    >the situation for
    >026B ; LATIN ; Atomic-no-uppercase # L& (ɫ) LATIN SMALL LETTER L WITH MIDDLE TILDE
    Maybe this is the formal situation in the current version of Unicode,
    but the proposal for this letter also has evidence
    that this letter does have an uppercase, although the evidence is only
    for one little used language, and this uppercase has been accepted by
    the UTC as provisional U+2C62. So the only difference between this and
    Polish is that the latter has more speakers.
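
    One way to check what a given Unicode implementation actually records
    for the characters under discussion is sketched below. The describe
    helper is hypothetical, and whether an uppercase mapping is reported
    for U+026B depends on the version of the Unicode database bundled with
    the interpreter, so no particular result is assumed for that part:

```python
import unicodedata


def describe(ch: str) -> str:
    """Summarise a character's name and uppercase mapping (hypothetical helper)."""
    # Fall back to a placeholder if this Unicode version has no name for ch.
    name = unicodedata.name(ch, "<no name in this Unicode version>")
    upper = ch.upper()
    if upper == ch:
        mapping = "none"
    else:
        # An uppercase mapping may consist of more than one code point.
        mapping = " ".join(f"U+{ord(c):04X}" for c in upper)
    return f"U+{ord(ch):04X} {name}; uppercase: {mapping}"


# U+026B is the L with middle tilde, U+01C2 the alveolar click letter.
for ch in ("\u026b", "\u01c2"):
    print(describe(ch))
```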

    Later, Mark wrote:

    >But if 'k' really were not used in any real publications in a modern
    >language, then it would be a different story (see my previous message).
    Sorry to quote the same proposal yet
    again, but it gives evidence of these letters being used in real
    publications in a modern language.

    Peter Kirk
