Re: Regulating PUA.

From: Richard Wordingham (
Date: Sun Jan 21 2007 - 15:39:19 CST

    Mike wrote on Sunday, January 21, 2007 6:56 PM

    > When I implemented collation, I needed to define code points for
    > the various contractions that can occur. To avoid clashing with
    > any private use code points, I chose to start allocating the con-
    > tractions at 0x110000. This has worked quite nicely.

    One problem with that solution is that it may work with extensions of
    UTF-8 or UTF-32, but it cannot work with UTF-16, whose surrogate pairs
    reach no higher than 0x10FFFF. The other is that even with the first two,
    especially an extended UTF-8, you are quite likely to fall foul of
    defensive code guarding against impossible code points. It's a shame, for
    I had been about to suggest it.
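
    To make both objections concrete, here is a minimal Python sketch. The
    function names are hypothetical and chosen only for illustration; the
    value 0x110000 is the starting point Mike mentions. It computes the
    largest value a surrogate pair can carry and shows the sort of range
    check that defensive code typically applies.

```python
MAX_UNICODE = 0x10FFFF      # highest code point Unicode defines
FIRST_INTERNAL = 0x110000   # first value past the Unicode range (from the quoted message)


def utf16_max_code_point() -> int:
    """Largest value a UTF-16 surrogate pair can represent."""
    high_payload = 0xDBFF - 0xD800  # 10 payload bits from the high surrogate
    low_payload = 0xDFFF - 0xDC00   # 10 payload bits from the low surrogate
    return 0x10000 + (high_payload << 10) + low_payload


def is_valid_scalar(cp: int) -> bool:
    """A typical defensive check: in range and not a surrogate."""
    return 0 <= cp <= MAX_UNICODE and not (0xD800 <= cp <= 0xDFFF)


def python_accepts(cp: int) -> bool:
    """Report whether Python's own chr() accepts the value."""
    try:
        chr(cp)
        return True
    except ValueError:
        return False
```

    Under these assumptions, utf16_max_code_point() comes out at exactly
    0x10FFFF, so a value of 0x110000 has no UTF-16 representation at all,
    and both is_valid_scalar() and Python's built-in chr() reject it.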

    There actually already is a division of the PUA in the BMP - the low end is
    for end users and the high end is for system vendors and software
    developers. What is lacking is a definition of where the boundary lies.
    This principle seems to be generally followed. Of course, there is nothing
    to stop end-users clashing - they will depend on fonts to keep the character

    The big problem is 'agreements' which are more like offers one cannot refuse.
    There is probably no way round this.

