Re: Unicode abuse

From: Rick McGowan (rick@unicode.org)
Date: Wed Mar 09 2005 - 20:19:44 CST

Next message: Christopher Fynn: "Re: Encoded rendering instructions (was Unicode's Mandate)"

Previous message: Rick McGowan: "Re: Small Java implementation of NFC"
Maybe in reply to: David Starner: "Unicode abuse"

A number of mail messages from Mark Davis were not distributed by this
server due to a configuration problem. Attached below is a copy of one such
message.

Rick

--- Below this line is a copy of the message.

From: "Mark Davis" <mark.davis@jtcsv.com>
To: "Erik van der Poel" <erik@vanderpoel.org>,
"Doug Ewell" <dewell@adelphia.net>
Cc: "Unicode Mailing List" <unicode@unicode.org>
Subject: Re: Unicode abuse
Date: Sun, 6 Mar 2005 10:58:22 -0800

I don't view this as a problem, if user-agents take the simple precaution of
displaying IDNs in post-nameprep form, which they really want to do for
other reasons.

Mark
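
A minimal sketch of the precaution described above, assuming a user agent written in Python. It relies on the standard library's `encodings.idna.nameprep`. The helper name `display_form` and the sample domain are hypothetical, and the sketch splits labels only on the ASCII full stop, which a real IDN implementation would not limit itself to.

```python
from encodings.idna import nameprep


def display_form(domain: str) -> str:
    """Return the domain with every label shown in post-nameprep form.

    Hypothetical helper: labels are split on "." only, and any
    UnicodeError raised by nameprep for a prohibited code point is
    left to the caller.
    """
    return ".".join(nameprep(label) for label in domain.split("."))


# U+2102 DOUBLE-STRUCK CAPITAL C is mapped to a plain "c", so the user
# sees the name that will actually be looked up.
shown = display_form("\u2102omplex.example")
```

Here `shown` is `complex.example`, so a link written with the double-struck letter is displayed in the same form as the plain ASCII name.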

----- Original Message -----
From: "Erik van der Poel" <erik@vanderpoel.org>
To: "Doug Ewell" <dewell@adelphia.net>
Cc: "Unicode Mailing List" <unicode@unicode.org>
Sent: Saturday, March 05, 2005 15:14
Subject: Re: Unicode abuse

> Doug Ewell wrote:
> > This brings up the topic of "Unicode abuse" in general. Conformance to
> > the Unicode Standard (see DUTR #33, of which Asmus is a co-author)
> > generally refers to support for and adherence to the "letter of the
> > law," things like implementing normalization and casing correctly. It's
> > not quite so easy to quantify adherence to the "spirit of the law," in
> > terms of things like abusing math characters and compatibility
> > characters, or using directional overrides where they don't harm
> > anything and aren't invalid, but also aren't necessary or appropriate.
> >
> > This almost falls into the same category as spoofing, which is being
> > addressed in a different UTR, but seems different somehow.
>
> It's funny that you should mention that today. Why, just yesterday, I
> wrote this new section:
>
> http://nameprep.org/#map-norm
>
> This may be somewhat subjective, but to me, it seems unnecessary and
> inappropriate to allow U+2102 DOUBLE-STRUCK CAPITAL C = the set of
> complex numbers, in HTML links.
>
> This is indeed different from spoofing, but if Nameprep continues to
> allow this type of character in pre-mapped IDNs, we may well see the
> proliferation of yet another type of "garbage" on the Web.
>
> IDN spoofing is done using characters of Stringprep category AO, while
> unnecessary and inappropriate IDN characters are of category MN:
>
> 7.1 Categories of code points
>
> Each code point in a repertoire named by a profile of stringprep can
> be categorized by how it acts in the process described in earlier
> sections of this document:
>
> AO   Code points that can be in the output
>
> MN   Code points that cannot be in the output because they
>      never appear as output from mapping or normalization
>
> D    Code points that cannot be in the output because they are
>      disallowed in the prohibition step
>
> U    Unassigned code points
>
> Cheers,
>
> Erik
>
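
The four categories quoted above can be approximated for the Nameprep profile with Python's standard `stringprep` module. This is only a sketch under stated assumptions: the function name `classify` is hypothetical, "MN" is approximated as "a single code point that mapping plus NFKC does not leave unchanged", and the Unicode 3.2 data is used because that is the version Stringprep is defined against.

```python
import stringprep
import unicodedata

# Stringprep is defined against Unicode 3.2, so use that data set.
ucd = unicodedata.ucd_3_2_0

# Prohibition tables used by the Nameprep profile (C.1.2, C.2.2, C.3 to C.9).
PROHIBITED = (
    stringprep.in_table_c12, stringprep.in_table_c22,
    stringprep.in_table_c3, stringprep.in_table_c4,
    stringprep.in_table_c5, stringprep.in_table_c6,
    stringprep.in_table_c7, stringprep.in_table_c8,
    stringprep.in_table_c9,
)


def classify(ch: str) -> str:
    """Approximate the Stringprep category of one code point."""
    if stringprep.in_table_a1(ch):
        return "U"   # unassigned in Unicode 3.2
    if stringprep.in_table_b1(ch):
        return "MN"  # mapped to nothing
    mapped = ucd.normalize("NFKC", stringprep.map_table_b2(ch))
    if mapped != ch:
        return "MN"  # changed by case folding or normalization
    if any(test(ch) for test in PROHIBITED):
        return "D"   # survives mapping but is prohibited
    return "AO"      # can appear in the output


categories = {name: classify(ch) for name, ch in [
    ("U+2102", "\u2102"),  # DOUBLE-STRUCK CAPITAL C
    ("U+0430", "\u0430"),  # CYRILLIC SMALL LETTER A
    ("U+E000", "\ue000"),  # private use
    ("U+0378", "\u0378"),  # unassigned
]}
```

Under these assumptions U+2102 comes out as MN, while the Cyrillic letter that resembles a Latin "a" comes out as AO, which matches the distinction drawn above between inappropriate characters and spoofing characters.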


This archive was generated by hypermail 2.1.5 : Wed Mar 09 2005 - 20:20:16 CST