Re: What is the principle?

From: Mark Davis (mark.davis@jtcsv.com)
Date: Wed Mar 31 2004 - 20:01:43 EST

Next message: fantasai: "Re: Fixed Width Spaces (was: Printing and Displaying DependentVowels)"

Previous message: Kenneth Whistler: "RE: Unicode 4.0.1 Released"
In reply to: Peter Kirk: "Re: What is the principle?"
Next in thread: Rick McGowan: "Re: What is the principle?"

Comments below.

Mark
__________________________________
http://www.macchiato.com
► शिष्यादिच्छेत्पराजयम् ◄

----- Original Message -----
From: "Peter Kirk" <peterkirk@qaya.org>
To: "Mark Davis" <mark.davis@jtcsv.com>
Cc: <unicode@unicode.org>
Sent: Wed, 2004 Mar 31 19:15
Subject: Re: What is the principle?

> On 31/03/2004 14:27, Mark Davis wrote:
>
> >While I disagree with most of what you've said on this list, it is not an
> >unreasonable proposal to change the default properties for some ranges of the
> >private use blocks. I don't think that this would, in practice, really disturb
> >any applications, because of #1 below.
> >
> >I have, however, a few observations.
> >
> >1. PUA properties, as is clear from Ken's excellent descriptions, are simply
> >defaults. With the exception of normalization, no Unicode implementation is
> >required to observe them. So even if this change is made, any conformant
> >implementation is free to simply ignore it and just assign its own properties.
> >This would not be a magic wand.
> >
> >
>
> Understood. But I was rather thinking that at least some implementations
> base their character properties directly on the Unicode character
> database. Isn't this what ICU does? And so, if the PUA default
> properties are the ones in the UCD, they would automatically be used by
> implementations.

Yes, some do (and ICU does pick up the default). Just pointing out that
implementations can freely choose the properties (except normalization).
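
To make that concrete, here is a minimal sketch in Python (not ICU, and not any real implementation's API). It assumes that Python's standard `unicodedata` module stands in for "the UCD defaults"; the `PRIVATE_OVERRIDES` table, its range and value, and the `general_category` helper are hypothetical names invented for the illustration.

```python
import unicodedata

# Hypothetical private agreement: treat U+E000..U+E0FF as nonspacing marks.
# The range and the value "Mn" are assumptions made up for this sketch.
PRIVATE_OVERRIDES = {range(0xE000, 0xE100): "Mn"}

def general_category(ch: str) -> str:
    """Return the privately agreed General_Category if one applies,
    otherwise fall back to the UCD default."""
    cp = ord(ch)
    for code_points, category in PRIVATE_OVERRIDES.items():
        if cp in code_points:
            return category
    return unicodedata.category(ch)

default_gc = unicodedata.category("\ue000")  # the UCD default for private use
custom_gc = general_category("\ue000")       # this implementation's own choice
unaffected = general_category("\ue100")      # PUA, but outside the override range
```

An implementation that only ever calls `unicodedata.category` picks up the default; one that routes through something like `general_category` has chosen its own values, and both are conformant.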

BTW, you have been mentioning the combining class; you can have combining marks
in the PUA, but they must have a canonical combining class of zero.
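
A small sketch of what a zero combining class means in practice, again using Python's standard `unicodedata` module as a stand-in for a conformant implementation. The choice of U+E000 and of the two accents is arbitrary, and the variable names are only for illustration.

```python
import unicodedata

# Private-use code points carry canonical combining class 0 in the UCD.
pua_ccc = unicodedata.combining("\ue000")

# Two ordinary marks with non-zero classes (230 above, 220 below) are put
# into canonical order by normalization.
plain = "a\u0301\u0323"            # a + acute (above) + dot below
plain_nfd = unicodedata.normalize("NFD", plain)

# A PUA character between them has class 0, so the marks on either side
# of it are never reordered across it.
with_pua = "a\u0301\ue000\u0323"
with_pua_nfd = unicodedata.normalize("NFD", with_pua)
```

So a privately defined "combining mark" in the PUA can be drawn as a mark by a font, but normalization will always treat it as a starter.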

>
> >2. Unicode properties are not sufficient for rendering. With technologies such
> >as Apple's, all of the other work can be done in a font. With OpenType, most but
> >not all can -- in particular, reordering has to be done by the application/OS.
> >So complex scripts that require reordering still would not be interchangeable
> >without private agreement.
> >
> >
>
> This is why the suggestions made for storing character properties in the
> font are unrealistic; they require major restructuring of system
> software (close to rewriting the whole OS, as I wrote earlier), not just
> tinkering. I accept that there may be some practical limitations on PUA
> complex scripts, but I would like them to be a lot less than they are now.

ANY dynamic reassignment of properties requires a major overhaul. There have
been proposals over the years for exchange of PU property data. All of them have
died, and I never expect to see any succeed.

The reason is that most implementations just get properties with static calls,
e.g. isLetter(x). To change it to be dynamic, all of these calls in all programs
would have to be changed to reference a dynamic collection of properties. In a
single-threaded world, this wouldn't be too bad. But ours is a multi-threaded
world, where it is nasty, and horrible if the same document is expected to
contain different sets of PU properties. There are also
performance implications, since properties are used so heavily in processing.
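
A rough sketch of the difference, in Python for brevity. The static `is_letter` mirrors the isLetter(x) style of call mentioned above; the `PropertySet` class and both `count_letters_*` functions are hypothetical names made up for this illustration, not any real library's API.

```python
import unicodedata

# Static style: one global answer per character, callable from anywhere.
def is_letter(ch: str) -> bool:
    return unicodedata.category(ch).startswith("L")

def count_letters_static(text: str) -> int:
    return sum(is_letter(ch) for ch in text)

# Dynamic style: the answer depends on which property set is in effect, so
# every caller has to obtain one and pass it down to every call site.
class PropertySet:
    def __init__(self, overrides=None):
        # Maps a character to a privately agreed General_Category value.
        self._overrides = dict(overrides or {})

    def category(self, ch: str) -> str:
        return self._overrides.get(ch, unicodedata.category(ch))

    def is_letter(self, ch: str) -> bool:
        return self.category(ch).startswith("L")

def count_letters_dynamic(text: str, props: PropertySet) -> int:
    return sum(props.is_letter(ch) for ch in text)

text = "ab\ue000"
static_count = count_letters_static(text)
default_count = count_letters_dynamic(text, PropertySet())
private_count = count_letters_dynamic(text, PropertySet({"\ue000": "Lo"}))
```

Even in this toy, the signature of every function that touches a property has to change, and something has to decide which `PropertySet` applies to which thread and which stretch of text.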

These are not whims of software vendors; they would be very expensive retrofits
for essentially no benefit.

>
> >3. Even excluding the normalization properties and other obvious inapplicable
> >properties (such as name or age), there are some 50-odd possible character
> >properties, many of them with multiple possible values: see
> >
> >http://www.unicode.org/Public/UNIDATA/PropertyAliases.txt
> >http://www.unicode.org/Public/UNIDATA/UCD.html#Properties
> >http://www.unicode.org/Public/UNIDATA/PropertyValueAliases.txt
> >
> >A concrete proposal would have to specify exactly which properties were
> >relevant, and what the values are for the proposed ranges. (Clearly an even
> >partition according to all the possible combinations would be completely
> >impractical.) If the goal is rendering, this means looking at the possible
> >combinations of properties that are relevant for rendering and proposing a
> >division that makes sense.
> >
> >
>
> That is why I (rather than Ernest) have discussed only rendering related
> properties like bidi and default ignorable. I realise that there may be
> other properties which need to be considered, but I am not yet sure
> which these are.

Those alone won't work. If you want stuff to render right, then you have to
include *any* property that systems may use to affect display. You do want these
characters to linebreak correctly, eh? That's why I said that a complete
proposal would have to spell out all the properties that would be considered,
and give reasons for the inclusions/exclusions.

>
> I sense that you prefer to change the default properties of existing PUA
> characters rather than add new ones. Might it be sensible to adjust the
> properties in one of the PUA planes but leave the other one untouched?
> Has ANYONE actually defined characters in one or other of these planes,
> and if so, which? It would make more sense to change the default
> properties of a plane which no one is actually using.

1. There is no way I would advocate adding even more PU characters; the number
we have is wasteful as it is. (In hindsight, we shouldn't have gone beyond
U+FFFFF in any event.)

2. If you are going to make this proposal, I'd suggest using a small part of one
plane, probably at the high end.

>
> >Mark
> >__________________________________
> >http://www.macchiato.com
> >► शिष्यादिच्छेत्पराजयम् ◄
> >
> >
> >
>
>
> --
> Peter Kirk
> peter@qaya.org (personal)
> peterkirk@qaya.org (work)
> http://www.qaya.org/
>
>


This archive was generated by hypermail 2.1.5 : Wed Mar 31 2004 - 20:41:51 EST