There are TWO encodings required here.
I think the original author's problem was with DNS names (hence the
reference to BIND). Percent-encoding cannot be used to put Unicode
characters into a domain name. This is the piece where standardization
is going on now (cf. LACE, RACE, UTF-5, iDNS, etc.)
What Edward is referring to is the standard URI encoding already used for
the *rest* of a URI (i.e. everything after the host name, starting at the
first '/' of the path). This
code is available in a number of places. A good one (embedded in a clear
discussion of URIs and I18n) is located at:
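
For illustration only (this is not the code from the archive or from the reference above): a minimal Python sketch of that second encoding, assuming the non-ASCII characters are converted to UTF-8 and each byte is then written as %XX. The function names and the example.com URI are made up for the example. The host name is deliberately left untouched, since that is the separate DNS problem.

```python
from urllib.parse import quote, unquote, urlsplit, urlunsplit


def encode_uri_path(uri: str) -> str:
    """Percent-encode the path of `uri` as UTF-8 bytes; leave the host alone."""
    parts = urlsplit(uri)
    # quote() converts the string to UTF-8 and escapes each unsafe byte
    # as %XX; '/' is kept so the path segments stay intact.
    encoded_path = quote(parts.path, safe="/")
    return urlunsplit(
        (parts.scheme, parts.netloc, encoded_path, parts.query, parts.fragment)
    )


def decode_uri_path(uri: str) -> str:
    """Reverse encode_uri_path: turn %XX sequences in the path back into text."""
    parts = urlsplit(uri)
    # unquote() interprets the %XX bytes as UTF-8 by default.
    decoded_path = unquote(parts.path)
    return urlunsplit(
        (parts.scheme, parts.netloc, decoded_path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    original = "http://www.example.com/résumé.html"
    encoded = encode_uri_path(original)
    print(encoded)
    print(decode_uri_path(encoded) == original)
```

Note that a path which already contains %XX sequences would be escaped a second time ('%' becomes '%25'), so this sketch should only be applied to an unencoded path.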
Addison P. Phillips Principal Consultant
Inter-Locale LLC http://www.inter-locale.com
Los Gatos, CA, USA mailto:firstname.lastname@example.org
+1 408.210.3569 (mobile) +1 408.904.4762 (fax)
Globalization Engineering & Consulting Services
On Mon, 27 Nov 2000, Suzanne Topping wrote:
> ----- Original Message -----
> From: "Edward Cherlin" <email@example.com>
> > I was involved in the process that led to Martin's RFC. You can find
> > the archive of that part of our discussion in
> > http://lists.w3.org/Archives/Public/uri/1997Apr/
> > and neighboring directories.
> > >The BIND development team is also looking at the problem. Adding
> > >support for Unicode URI's is not an easy task, but it is something that
> > >needs to be done and is long overdue.
> > We provided source code, which you can find in the archive, for the
> > rather simple process of encoding and decoding Unicode URIs.
> Can you point us to where this code might be found? I poked around the
> archive for 10 minutes or so, and wasn't able to come up with it.
> Suzanne Topping
> BizWonk Inc.