Re: SSP default ignorable characters, was: Variation selectors and vowel marks

From: Peter Kirk
Date: Mon Apr 26 2004 - 04:57:12 EDT


    On 25/04/2004 17:19, Doug Ewell wrote:

    >Peter Kirk <peterkirk at qaya dot org> wrote:
    >>I wonder if it would be possible to set aside part of the SSP block of
    >>default ignorable characters as a private use area? These can then be
    >>used for private use combining marks (which would have to have
    >>combining class zero, but consistent ordering of PUA marks should be
    >>the PUA user's responsibility), which would simply be ignored by fonts
    >>which don't support them, which is the best fallback for combining
    >>marks. Or they could be used as private use variation selectors, dare
    >>I suggest it?
    >You may, but people might start calling you "William." ☺
    >>I would suggest perhaps a block of 256 characters, which would allow
    >>those who choose to do so to use this for some kind of invisible
    >>annotation. Obviously these are not really plain text issues; but then
    >>the point of a private use area is to allow people to do things which
    >>are not standardised. I suspect that allowing this kind of thing will
    >>be a good way of getting many people off the backs of the UTC!
    >Instead of asking for a new block of private-use code points that are
    >default-ignorable, why not define your private-use characters out of the
    >existing blocks and define THEM as default-ignorable? You have every
    >right to do so; it is the essence of a private agreement to give your
    >PUA characters properties as well as glyphs. And if you say, well, this
    >won't work because Microsoft Word and Internet Explorer and other tools
    >and vendors don't let me override the default PUA properties, I reply:
    >do you really think they will be any quicker to support this new PUA

    Yes, because the whole point of defining this block of characters as
    default ignorable is that implementations are ALREADY supposed to
    ignore these code points in processing and display, even before they
    are defined as characters. I would expect the latest versions of
    Unicode-compatible tools to treat these code points as if they were
    already defined default ignorable characters. On the other hand, if I
    define my own PUA characters as default ignorable, I can expect my
    private definitions NEVER to be supported by standard software,
    because I can't make private agreements with Microsoft or other
    significant software providers. Of course, it is not impossible that
    someone, somewhere, some time might write software which allows users
    to specify their own properties for PUA characters.
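
    To make the distinction concrete, here is a minimal Python sketch,
    not any vendor's actual implementation. It assumes a tool that drops
    everything in U+E0000..U+E0FFF (the only default ignorable range it
    models; others exist outside the SSP) and that optionally accepts a
    hypothetical private_ignorables set standing in for a private
    agreement. The names is_ignorable, strip_ignorables and
    private_ignorables are illustrative assumptions, not an existing API.

```python
# SSP range treated as default ignorable, assigned or not (U+E0000..U+E0FFF).
SSP_DEFAULT_IGNORABLE = range(0xE0000, 0xE1000)

# The three private use areas defined by Unicode.
PUA_RANGES = (
    range(0xE000, 0xF900),       # BMP PUA: U+E000..U+F8FF
    range(0xF0000, 0xFFFFE),     # Plane 15: U+F0000..U+FFFFD
    range(0x100000, 0x10FFFE),   # Plane 16: U+100000..U+10FFFD
)


def is_private_use(cp: int) -> bool:
    """True if the code point lies in one of the private use areas."""
    return any(cp in r for r in PUA_RANGES)


def is_ignorable(cp: int, private_ignorables=frozenset()) -> bool:
    """True if this sketch would ignore the code point.

    private_ignorables is a hypothetical per-user override: the set of
    code points a private agreement declares ignorable.
    """
    return cp in SSP_DEFAULT_IGNORABLE or cp in private_ignorables


def strip_ignorables(text: str, private_ignorables=frozenset()) -> str:
    """Drop every code point this sketch treats as ignorable."""
    return "".join(
        ch for ch in text if not is_ignorable(ord(ch), private_ignorables)
    )
```

    A tool with no private_ignorables hook keeps U+E000 in the text while
    silently dropping the unassigned U+E0FFF, which is the asymmetry
    described above; only a tool that accepts the override can honour a
    private definition.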

    OK, if I were a hacker I might be able to hack open source software, but
    if I were a hacker I would find easier ways of hacking my requirements
    into Unicode.

    Doug, just be happy that your own private script is LTR with no
    combining characters, and so can be supported in the PUA. It seems that,
    in practice if not in principle, the PUA is restricted to such scripts.

    Peter Kirk
