From: CE Whitehead (cewcathar@hotmail.com)
Date: Wed Nov 24 2010 - 22:33:50 CST
Hi, according to
http://www.icann.org/en/general/idn-guidelines-20jun03.htm
the following is a "should"; I guess it is not a "must" (correct me if I am wrong):
"5. In implementing the IDN standards, top-level domain registries should, at least initially, limit any given domain label (such as a second-level domain name) to the characters associated with one language or set of languages only."
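To make the guideline concrete, here is a rough Python sketch of a check a registry could run on a label. It is only an illustration under stated assumptions: the function names are hypothetical, and the first word of each character's Unicode name (e.g. "TELUGU", "KANNADA") is used as a stand-in for its script, which is a heuristic and not the real Unicode Script property. The two test labels are the ones from Shriramana's message quoted below.

```python
import unicodedata


def scripts_in_label(label):
    """Return a set of rough script names for the characters in a label.

    Assumption: the first word of the Unicode character name stands in
    for the script. Digits and hyphens are skipped as script-neutral.
    """
    scripts = set()
    for ch in label:
        if ch.isdigit() or ch == "-":
            continue
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def is_single_script(label):
    """True if every (non-neutral) character appears to share one script."""
    return len(scripts_in_label(label)) <= 1


# U+0C05 TELUGU LETTER A followed by three Telugu characters.
telugu_label = "\u0c05\u0c2a\u0c3e\u0c30"
# Same label, but starting with U+0C85 KANNADA LETTER A.
mixed_label = "\u0c85\u0c2a\u0c3e\u0c30"

print(is_single_script(telugu_label))
print(is_single_script(mixed_label))
print(sorted(scripts_in_label(mixed_label)))
```

A check like this would flag the Kannada/Telugu mix, but it says nothing about whole-label confusables written entirely in one script; those still need a mapping such as the one in Confusables.txt.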
http://www.iana.org/domains/idn-tables/
lists characters and character treatment rules for different languages for top-level domain registries; however, I see nothing for Telugu or Kannada . . .
(that's to come I suppose)
ICANN accredits registrars; see:
http://www.icann.org/en/registrars/accredited-list.html
And you have a right to report problems with accredited registrars to ICANN if you can't resolve them with the registrar, according to the following:
http://www.icann.org/en/registrars/accreditation.htm
One note: for Arabic there are two sets of Indic digits, with some digits being identical in both sets; both sets are allowed, which can lead to the registration of confusables. (I mentioned this before: since the alphabets are essentially the same, you can have banuk1.com with an Eastern 1 in one language confusable with banuk1.com with a Western 1 in Arabic itself.) See:
http://www.iana.org/domains/idn-tables/tables/xn--mgberp4a5d4ar_ar_1.0.html
"4. Numbers
In the Arab world, there are two sets of numerical digits used:
I. From U+0030 (Digit Zero) to U+0039 (Digit Nine)
Mostly used in the western part of the Arab world (al-maghrib al-arabi).
II. From U+0660 (Arabic-Indic Digit Zero) to U+0669 (Arabic-Indic Digit Nine),
Mostly used in the eastern part of the Arab world (al-mashriq al-arabi).
Hence, both sets should be supported in the user interface and both are folded to one set (Set I)
at the preparation of internationalized strings (e.g., "stringprep") phase."
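The folding described in that table can be sketched in a few lines of Python. This is only my illustration of the idea (map Set II onto Set I before comparing labels); the names `ARABIC_INDIC_TO_ASCII` and `fold_digits` are hypothetical, and I am not claiming this is how any registry or stringprep implementation actually does it.

```python
# Map U+0660..U+0669 (Set II, Arabic-Indic digits) onto
# U+0030..U+0039 (Set I), digit by digit.
ARABIC_INDIC_TO_ASCII = {0x0660 + i: 0x0030 + i for i in range(10)}


def fold_digits(label):
    """Fold Arabic-Indic digits in a label to their Set I equivalents."""
    return label.translate(ARABIC_INDIC_TO_ASCII)


eastern = "banuk\u0661"  # ends with U+0661 ARABIC-INDIC DIGIT ONE
western = "banuk1"       # ends with U+0031 DIGIT ONE

print(eastern == western)
print(fold_digits(eastern) == fold_digits(western))
```

If both labels are folded before registration, the second one collides with the first and can be refused; without folding they are distinct strings.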
Best,
--C. E. Whitehead
cewcathar@hotmail.com
________________________________
> Date: Wed, 24 Nov 2010 17:07:37 +0530
> Subject: Re: Phishing and enforcing Confusables.txt
> From: akshat.gist@gmail.com
> To: samjnaa@gmail.com
> CC: unicode@unicode.org
>
> Dear Shriramana,
>
> IMO, the authoritative body in this case has to be the registry that is
> holding the Top Level Domain. (.com in this case)
> There are different bodies for various TLDs.
> If such kind of phishing attacks are to be prevented, the registry
> operating bodies need to be made aware of Confusables.txt and the need
> of handling the same.
>
> Regards,
> Akshat
>
>
> On Wed, Nov 24, 2010 at 2:39 PM, Shriramana Sharma wrote:
> Dear all,
>
> A friend of mine who is in the computer security industry told me that
> Confusables.txt is NOT enforced across the world. For example, despite
> there existing a website అపార.com with a Telugu అ registered somewhere
> in the world, another (phishing) website ಅపార.com with a Kannada ಅ may
> be later registered elsewhere in the world despite the following
> confusable mapping in the Confusables.txt:
>
> 0C85 ; 0C05 ; ML # ( ಅ → అ ) KANNADA LETTER A → TELUGU LETTER A #
>
> I certainly hope this is not true! Please clarify. Is there no
> authoritative body to prevent such duplicate encoding? Doesn't the
> IANA do this?
>
> Shriramana Sharma.
>
>
>