Re: Security Issues

From: Mark Davis
Date: Sun Apr 03 2005 - 14:16:31 CST


    Each character in the FOR REVIEW list is collected because either:

    (a) it would not count as part of an XID, or
    (b) it is part of a bicameral script and doesn't have an uppercase, which is
    the situation for

    026B ; LATIN ; Atomic-no-uppercase # L& (ɫ) LATIN SMALL LETTER L

    In either case there is prima facie reason for some level of scrutiny, if
    the goal is to be initially conservative in repertoire. (In this, I am not
    necessarily advocating one or another approach; simply trying to gather
    information so that informed judgments can be made.)
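
    The two criteria above can be sketched roughly as follows. This is only an
    illustration, not the script that produced the FOR REVIEW list: the helper
    names are hypothetical, and it assumes that Python's str.isidentifier()
    tracks XID_Start/XID_Continue. Results depend on the Unicode version
    bundled with the Python runtime, since a character can gain an uppercase
    counterpart in a later version.

```python
import unicodedata


def is_xid_continue(ch: str) -> bool:
    # Assumption: str.isidentifier() follows XID_Start/XID_Continue, so
    # prefixing a known-good start character isolates the XID_Continue test.
    return ("a" + ch).isidentifier()


def lacks_uppercase(ch: str) -> bool:
    # Criterion (b), approximated: a lowercase letter (general category Ll)
    # whose uppercase mapping is the character itself.
    return unicodedata.category(ch) == "Ll" and ch.upper() == ch


def review_reasons(ch: str) -> list:
    # Collect the reasons (hypothetical labels) a character would be flagged.
    reasons = []
    if not is_xid_continue(ch):
        reasons.append("not-XID")
    if lacks_uppercase(ch):
        reasons.append("no-uppercase")
    return reasons


if __name__ == "__main__":
    # Code points taken from the lists in this message.
    for cp in (0x026B, 0x0264, 0x018C, 0x026E, 0x02C2, 0x02D2):
        ch = chr(cp)
        name = unicodedata.name(ch, "<unnamed>")
        print(f"{cp:04X} {name}: {review_reasons(ch) or 'no flag'}")
```

    The category test stands in for "part of a bicameral script", which is a
    simplification; a fuller check would consult the Scripts data as well.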

    The WORD CHARACTERS ADDED list also needs review, especially for the
    MODIFIER LETTERs, to see which of those (if any) are used in modern
    languages. I am hoping to get information back like the following:

    Referring to the characters at the end of
    0264  ; Ll # (ɤ)  LATIN SMALL LETTER RAMS HORN
    needs to be included. It is lowercase in form only; it is used caselessly,
    which explains the lack of uppercase. It is used for Wičita, a modern
    language spoken in Kansazia, with several weekly newspapers (eg
    018C          ; LATIN ; Atomic # L& (ƌ) LATIN SMALL LETTER D WITH TOPBAR
    is only used for Northeastern Squamish. There are no regular modern
    publications using this character, outside of articles on linguistics.
    026E  ; Ll # (ɮ)  LATIN SMALL LETTER LEZH
    02C2  ; word-chars # Sk (˂)  MODIFIER LETTER LEFT ARROWHEAD
    02C3  ; word-chars # Sk (˃)  MODIFIER LETTER RIGHT ARROWHEAD
    02C4  ; word-chars # Sk (˄)  MODIFIER LETTER UP ARROWHEAD
    02C5  ; word-chars # Sk (˅)  MODIFIER LETTER DOWN ARROWHEAD
    02D2  ; word-chars # Sk (˒)  MODIFIER LETTER CENTRED RIGHT HALF RING
    are only used in the Danish Gua'uld system of phonetic transcription, not
    for any modern language.
    ----- Original Message ----- 
    From: "Peter Kirk" <>
    To: "Doug Ewell" <>
    Cc: "Unicode Mailing List" <>
    Sent: Sunday, April 03, 2005 12:15
    Subject: Re: Security Issues
    > On 03/04/2005 17:27, Doug Ewell wrote:
    > > ...
    > >
    > >There's also a significant controversy surrounding the ability of some
    > >evil person to register "paypaɫ.com" or similar, using a letter like
    > >U+026B that most people in the world aren't aware exists, ...
    > >
    > The standard should not pander to ignorance. Don't forget that there are
    > billions of Chinese, Indians etc who are not familiar even with our
    > basic ABC.
    > >... and using it
    > >to dupe innocent consumers.  People are running around screaming that
    > >internationalized domain names are evil for allowing these characters,
    > >and that Unicode is evil for including them in the first place.  This
    > >"security" thread is an attempt to work out the best solution for all.
    > >
    > >
    > >
    > I see the point. But if we are going to allow U+0142 to support Polish,
    > and so to allow anyone to register "paypał.com", then there is not much
    > difference allowing them to use "paypaɫ.com", with U+026B. Perhaps
    > U+0142 and U+026B can be listed as lookalikes. Actually, does anyone
    > want U+026B? This is not a click. Perhaps you were thinking of U+01C2.
    > -- 
    > Peter Kirk

    This archive was generated by hypermail 2.1.5 : Sun Apr 03 2005 - 14:17:57 CST