Re: valid characters in user names- esp. compatibility characters

From: Tex Texin
Date: Fri Aug 13 2004 - 03:09:20 CDT

    Markus, Mike, Jungshik,

    Thanks for the info on the encoding schemes. I know about the ICU
    implementation too.

    My question was really directed at the issue of which characters are used or
    needed in names, and whether NFKC normalized strings are adequate. I did get
    some private assurance that they are, or at least that normalizing out
    compatibility characters would not be a problem. Whereas there are (too) many
    encoding schemes, only the IDNA specs identify characters that should not be
    used and would cause confusion. That aspect of the algorithm is what makes it
    useful for this application.
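
To make the point about NFKC and prohibited characters concrete, here is a minimal Python sketch, assuming the standard-library `unicodedata` and `stringprep` modules. The function names `normalize_username` and `prohibited_chars` are illustrative and do not come from this thread; the table selection follows what Nameprep (RFC 3491) lists as prohibited output, and the sketch omits Nameprep's mapping and bidirectional checks, so it is not a full implementation.

```python
import stringprep
import unicodedata

# RFC 3454 "prohibited output" tables that Nameprep (RFC 3491) refers to.
_PROHIBITED_TABLES = (
    stringprep.in_table_c12,  # non-ASCII space characters
    stringprep.in_table_c22,  # non-ASCII control characters
    stringprep.in_table_c3,   # private use
    stringprep.in_table_c4,   # non-character code points
    stringprep.in_table_c5,   # surrogate codes
    stringprep.in_table_c6,   # inappropriate for plain text
    stringprep.in_table_c7,   # inappropriate for canonical representation
    stringprep.in_table_c8,   # change display properties or are deprecated
    stringprep.in_table_c9,   # tagging characters
)


def normalize_username(name: str) -> str:
    """Return the NFKC form of a name, folding compatibility characters."""
    return unicodedata.normalize("NFKC", name)


def prohibited_chars(name: str) -> list:
    """List characters of the NFKC-normalized name found in a prohibited table."""
    normalized = normalize_username(name)
    return [ch for ch in normalized
            if any(in_table(ch) for in_table in _PROHIBITED_TABLES)]


if __name__ == "__main__":
    # Fullwidth letters and the "fi" ligature fold to plain ASCII under NFKC.
    print(normalize_username("\uff34\uff45\uff58"))
    print(normalize_username("\ufb01sh"))
    # U+200E LEFT-TO-RIGHT MARK is invisible and is listed in table C.8.
    print([hex(ord(ch)) for ch in prohibited_chars("te\u200ex")])
```

A name that comes back from `normalize_username` unchanged and yields an empty list from `prohibited_chars` passes both checks in this sketch; whether that is sufficient for a given system is a separate policy question.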

    I also heard privately that the GNU implementation was being used and that the
    current version works well.

    Thanks for the feedback.


    Markus Scherer wrote:
    > Another encoding, standardized for much longer, is what IMAP uses for mailbox names. I think it does
    > not have a standard charset name, but it's described in one of the IMAP RFCs. It's a modified UTF-7,
    > modified to make it filename-friendly and deterministic, and may fit the bill. It's certainly useful
    > to squeeze Unicode filenames into ASCII. I am sure there are many libraries with an implementation
    > (ICU has one, see convrtrs.txt).
    > By the way, ICU also implements IDNA and generic StringPrep.
    > Best regards,
    > markus
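
For readers unfamiliar with the IMAP mailbox-name encoding Markus mentions, the following is a rough Python sketch of the encoding direction only, based on the description in RFC 3501, section 5.1.3. The function names are hypothetical, and this is not ICU's implementation: printable ASCII passes through, "&" becomes "&-", and every other run of characters is written as "&" plus base64 of its UTF-16BE bytes (with "," in place of "/" and no "=" padding) plus "-".

```python
import base64


def _encode_run(chars: str) -> str:
    """Encode one run of non-ASCII characters as a modified-base64 section."""
    b64 = base64.b64encode(chars.encode("utf-16-be")).decode("ascii")
    # Modified base64: no '=' padding, and ',' replaces '/'.
    return "&" + b64.rstrip("=").replace("/", ",") + "-"


def encode_mutf7(text: str) -> str:
    """Encode text in the modified UTF-7 used for IMAP mailbox names."""
    out = []
    run = []  # pending characters outside printable ASCII
    for ch in text:
        if " " <= ch <= "~":
            if run:
                out.append(_encode_run("".join(run)))
                run.clear()
            # A literal ampersand is escaped as "&-".
            out.append("&-" if ch == "&" else ch)
        else:
            run.append(ch)
    if run:
        out.append(_encode_run("".join(run)))
    return "".join(out)


if __name__ == "__main__":
    # Mixed English, Chinese, and Japanese mailbox name, as in RFC 3501.
    print(encode_mutf7("~peter/mail/\u53f0\u5317/\u65e5\u672c\u8a9e"))
```

Because each run is closed as soon as a printable ASCII character appears, a given string has exactly one encoding here, which is the deterministic property the quoted message refers to.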

    Tex Texin   cell: +1 781 789 1898
    Xen Master                
    Making e-Business Work Around the World
