Re: SSP default ignorable characters, was: Variation selectors and vowel marks

From: Doug Ewell (dewell@adelphia.net)
Date: Mon Apr 26 2004 - 11:35:16 EDT

Next message: Ernest Cline: "RE: Proposal to add 2 Romanian characters"

Previous message: Peter Constable: "RE: Proposal to add 2 Romanian characters"
In reply to: Peter Kirk: "Re: SSP default ignorable characters, was: Variation selectors and vowel marks"
Next in thread: Peter Kirk: "Re: SSP default ignorable characters, was: Variation selectors and vowel marks"
Reply: Peter Kirk: "Re: SSP default ignorable characters, was: Variation selectors and vowel marks"

Peter Kirk <peterkirk at qaya dot org> wrote:

>> ... And if you say, well, this won't work because Microsoft Word and
>> Internet Explorer and other tools and vendors don't let me override
>> the default PUA properties, I reply: do you really think they will be
>> any quicker to support this new PUA block?
>
> Yes, because the whole point of the definition of this block of
> characters as default ignorable is that implementations are ALREADY
> supposed to ignore these code points in processing and display, even
> before they are defined as characters. I would expect the latest
> versions of Unicode compatible tools to treat these code points as if
> they were already defined default ignorable characters.

You could try going ahead and writing a proposal to carve out part of
the existing DI block as another private-use area. I suppose I know
what the response will be: (1) we already gave you 137,000 private-use
code points, what do you need more for? (2) if you say you need a new DI
PUA, next somebody will want one for RTL, one for combining marks, one
for font control, etc. etc., and (3) we don't want to be in the business
of assigning properties to PUA characters anyway; the *default*
properties we assigned are intended to be overridable by private
agreement.
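
For reference, the "137,000" figure can be checked with a quick count. This is only a sketch: the three ranges are the private-use ranges defined by the standard, while the names PUA_RANGES and pua_size are made up for illustration.

```python
# The three private-use ranges in Unicode (inclusive bounds).
PUA_RANGES = [
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
]


def pua_size(ranges=PUA_RANGES):
    """Count the code points covered by the given inclusive ranges."""
    return sum(hi - lo + 1 for lo, hi in ranges)


if __name__ == "__main__":
    # 6,400 in the BMP plus 65,534 in each of planes 15 and 16.
    print(pua_size())
```

The last two code points of planes 15 and 16 are noncharacters, which is why each supplementary area holds 65,534 code points rather than 65,536.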

The fact that Uniscribe and other rendering engines apply the *default*
properties to all PUA code points, and provide no mechanism to modify
them, is a fault in the rendering engines, although probably not a
high-priority one in vendors' eyes.

The Principles and Procedures document says that getting around
short-term deficiencies in rendering technology is explicitly *not* a
reason to create new characters, so I doubt it will be seen as a reason
to create new private-use areas either.

By far the most popular use of the PUA thus far has been as an ad-hoc
glyph registry for technologies (or people) that regard code points and
glyphs as 1-to-1. Very few people have tried to use the PUA for some
purpose that the default properties don't handle. That doesn't mean a
more flexible solution shouldn't be developed, but it does explain why
the big vendors haven't bothered developing one.

> On the other
> hand, if I define my own PUA characters as default ignorable, I can
> expect my private definitions NEVER to be supported by standard
> software, because I can't make private agreements with Microsoft or
> other significant software providers, although it is of course not
> impossible that someone somewhere some time just might write software
> which allows users to specify their own properties for PUA characters.

This is really the mechanism that is needed. Maybe I'll try outlining a
possible solution. If a small plug-in solution can be proven to the big
vendors to be low-cost and effective, who knows what they might say?
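
To make the idea concrete, here is a rough sketch of what a user-supplied property override for PUA code points might look like. It is an illustration under stated assumptions, not a description of any existing plug-in: the names CharProps, PrivateAgreement, define, and lookup are all hypothetical, and only Python's standard unicodedata module is used to obtain the default properties. The default_ignorable field is left as None for non-PUA characters because unicodedata does not expose that property.

```python
import unicodedata
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class CharProps:
    """A small subset of character properties (hypothetical structure)."""
    general_category: str
    bidi_class: str
    combining_class: int
    default_ignorable: Optional[bool]  # None = not determined by this sketch


def is_private_use(ch: str) -> bool:
    # General category "Co" marks private-use code points.
    return unicodedata.category(ch) == "Co"


def default_props(ch: str) -> CharProps:
    """Properties as the standard library reports them."""
    return CharProps(
        general_category=unicodedata.category(ch),
        bidi_class=unicodedata.bidirectional(ch),
        combining_class=unicodedata.combining(ch),
        # PUA code points are not default ignorable by default.
        default_ignorable=False if is_private_use(ch) else None,
    )


class PrivateAgreement:
    """Hypothetical table of per-code-point overrides for the PUA."""

    def __init__(self) -> None:
        self._overrides: Dict[int, CharProps] = {}

    def define(self, code_point: int, **changes) -> None:
        ch = chr(code_point)
        if not is_private_use(ch):
            raise ValueError(f"U+{code_point:04X} is not a private-use code point")
        # Start from any earlier override, else from the defaults.
        base = self._overrides.get(code_point, default_props(ch))
        self._overrides[code_point] = replace(base, **changes)

    def lookup(self, ch: str) -> CharProps:
        # Fall back to the default properties when no override exists.
        return self._overrides.get(ord(ch), default_props(ch))


if __name__ == "__main__":
    agreement = PrivateAgreement()
    # Declare U+E000 to be right-to-left and ignorable by private agreement.
    agreement.define(0xE000, bidi_class="R", default_ignorable=True)
    print(agreement.lookup("\uE000"))
    print(agreement.lookup("\uE001"))
```

A rendering engine that consulted something like lookup() instead of its built-in tables would pick up the private definitions while leaving every assigned character untouched.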

> OK, if I were a hacker I might be able to hack open source software,
> but if I were a hacker I would find easier ways of hacking my
> requirements into Unicode.

I'm sure the open-source people would prefer that you speak of
"programming" or "developing" instead of "hacking." I do both regularly
(no cracking, though) and trust me, they are not the same.

> Doug, just be happy that your own private script is LTR with no
> combining characters, and so can be supported in the PUA. It seems
> that, in practice if not in principle, the PUA is restricted to such
> scripts.

Actually this has nothing to do with my script, which IE doesn't display
properly anyway (it breaks lines arbitrarily between any two
characters).

-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/


This archive was generated by hypermail 2.1.5 : Mon Apr 26 2004 - 12:10:36 EDT