Re: What is the principle?

From: Peter Kirk (peterkirk@qaya.org)
Date: Tue Mar 30 2004 - 05:38:36 EST

    On 29/03/2004 15:14, Kenneth Whistler wrote:

    >Peter Kirk responded:
    >
    >
    >
    >>>Third, the proposal to "transfer ... some or all of the Variation
    >>>Selectors on the SSP to Private Use" is unclear on the concept of
    >>>Private Use. The UTC will make *no* semantic encoding commitment
    >>>regarding what a private use character is to be used for. That would
    >>>include *not* specifying that some range of Private Use characters
    >>>be dedicated to use as variation selectors (privately defined). ...
    >>>
    >>The problem here is that, despite what you say, the UTC has already
    >>specified the character properties of all of the existing PUA
    >>characters,
    >>
    >>
    >
    >No. The UTC has specified default values for the properties
    >for all code points, including the PUA, to prevent implementers
    >of property API's from having them blow up or return random
    >values for such code points.
    >
    >
    >
    Default values are still values, for whatever reason they have been
    specified. If one default value can be specified, so can a different one.

    >>in a way which rules out their use as variation selectors,
    >>or as combining marks, or as right-to-left characters.
    >>
    >>
    >
    >They do not. A user of PUA characters is free to define the
    >whole range of PUA characters as consisting of strong R-to-L
    >characters and implementing accordingly. ...
    >

    This is not true! Users can define only those properties which the
    software that they are using allows them to define. Your argument here
    completely ignores the distinction between users and software
    developers. You may have the luxury of being able to do both. But the
    vast majority of users depend on the software systems and applications
    provided by large corporate software companies. (Software written by
    smaller companies generally uses rendering engines, character processing,
    etc., provided by the large companies.) These large companies are mostly
    members of the Unicode Consortium. They are also overwhelmingly western,
    mostly American, and so inherently biased in favour of LTR scripts
    without combining marks. This bias is reflected in the "default"
    properties assigned to PUA characters, by their majority vote, and their
    refusal to contemplate changes. This bias is also reflected in their
    system software which (as far as I know, with no exceptions) does not
    allow users to specify properties for PUA characters other than the
    default decided by the UTC.
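
A minimal sketch of what these "default" values look like from the application side, assuming Python's standard unicodedata module as a stand-in for a platform property API. The sample code points and the helper name default_properties are chosen here purely for illustration.

```python
import unicodedata

# One sample code point from the BMP Private Use Area (two, including one
# from Ken's U+F8F0..U+F8F4 example) and one from each supplementary PUA.
PUA_SAMPLES = [0xE000, 0xF8F0, 0xF0000, 0x100000]


def default_properties(cp):
    """Return the properties the property database reports for a code point."""
    ch = chr(cp)
    return {
        "category": unicodedata.category(ch),    # general category
        "bidi": unicodedata.bidirectional(ch),   # bidirectional class
        "combining": unicodedata.combining(ch),  # canonical combining class
    }


for cp in PUA_SAMPLES:
    print(f"U+{cp:04X}", default_properties(cp))
```

An application that relies only on such an API has no way to learn that a given private use character is meant to be right-to-left or combining.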

    >... I have, for example, for
    >my own internal use for developing collation tables, defined
    >U+F8F0..U+F8F4 as being non-spacing combining marks (with no
    >display), for use in providing variant weights. These are, in
    >fact, very, very similar to what you are advocating for here as
    >specialized variant selectors. But I do so by my own *PRIVATE*
    >use of those characters, with code that assigns my own *PRIVATE*
    >semantics and operates accordingly. I don't expect, in this
    >case, to even require interoperability with anyone else, because
    >the usage is internal and what matters are the weighted outputs
    >in the tables. But in principle I could interchange this usage
    >with someone else who chose to also treat U+F8F0..U+F8F4 as
    >non-spacing combining marks with no display, indicating variant
    >forms.
    >
    >The problem you are having is that you (and most implementers) are
    >dependent on how the underlying *platform* treats the PUA, and
    >have not been given an API which makes it easy to specify
    >specific character properties along with your PUA character
    >assignments.
    >
    >
    >
    At least you understand the problem, which totally undermines your
    argument here.
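
The private assignment Ken describes, and the gap between it and the platform, can be sketched roughly as follows. This assumes Python's unicodedata module plays the role of the platform default. The override table PRIVATE_CATEGORY and the function private_category are hypothetical names, not part of any real API.

```python
import unicodedata

# Hypothetical private agreement: treat U+F8F0..U+F8F4 as non-spacing
# combining marks (general category Mn), as in Ken's collation example.
PRIVATE_CATEGORY = {cp: "Mn" for cp in range(0xF8F0, 0xF8F5)}


def private_category(ch):
    """Return the privately agreed category if one exists, else the default."""
    return PRIVATE_CATEGORY.get(ord(ch), unicodedata.category(ch))


print(private_category("\uF8F0"))      # privately overridden
print(unicodedata.category("\uF8F0"))  # what the platform database reports
```

Only code that calls the wrapper sees the private semantics. Rendering engines and other system software that consult the platform database directly still see the defaults.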

    >>As an alternative to adjusting the definitions of the existing variation
    >>selectors, might it be possible for the UTC to adjust the character
    >>properties of parts of the Supplementary Private Use Areas?
    >>
    >>
    >
    >If you are asking me, I'd say the answer is no.
    >
    >
    >
    >>For example,
    >>a range of characters could be defined as default ignorable, default
    >>collation weight [.0000.0000.0000.0000] etc., and so these could be used
    >>as private variation selectors, or as private diacritical marks (which
    >>would simply disappear if viewed with a regular font; they would be in
    >>combining class 0 and so there would be no normalisation issues); and
    >>another range could be defined as RTL; and whatever other ranges might
    >>be required.
    >>
    >>
    >
    >You can do it privately. See above. But attempting to do such things
    >in terms of formally specified usages of the PUA is an invitation
    >to failure of interoperability.
    >
    >

    I don't understand this last comment. But anyway, we don't want
    interoperability, because we know that the PUA is not intended for
    interchange of data, or at most should be used for interchange between
    small consenting communities who may have to use the same software. What
    we do want is compatibility between our applications and the system
    software, and this proposal is the way to do that.
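
As a rough illustration of what a completely ignorable private range would mean for comparison, here is a sketch in Python. The range U+F0000..U+F00FF is an arbitrary assumption for the example, not anything the UTC has specified, and plain code point order stands in for a real collation algorithm.

```python
# Hypothetical range of private use characters treated as completely
# ignorable, i.e. contributing nothing to the sort key.
IGNORABLE = range(0xF0000, 0xF0100)


def sort_key(text):
    """Build a comparison key that skips the hypothetical ignorable range."""
    return "".join(ch for ch in text if ord(ch) not in IGNORABLE)


plain = "abc"
marked = "ab\U000F0001c"  # same letters plus a private variation selector

print(sort_key(plain) == sort_key(marked))
print(sorted(["b", marked, "a"], key=sort_key) == ["a", marked, "b"])
```

With such a key, text carrying private selectors sorts exactly like the unmarked text, which is the behaviour the proposal asks system software to provide by default.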

    >
    >
    >>Alternatively, an additional PUA could be defined to avoid
    >>changing the properties of existing characters.
    >>
    >>
    >
    >This also won't happen. In my assessment the UTC is just dead set against
    >trying to create this kind of mechanism through proliferating types
    >of PUA spaces.
    >
    >
    >
    >>This cannot be in
    >>conflict with the principle that "The UTC will make *no* semantic
    >>encoding commitment regarding what a private use character is to be used
    >>for" because these kinds of properties have already been specified by
    >>the UTC for the existing PUA.
    >>
    >>
    >
    >Nope. You're wrong. A default value for a property is not a
    >requirement by the UTC regarding what a PUA character can or may
    >or must be used for.
    >
    >
    >
    Yes. If a default value is not a requirement, then a CHANGE to a default
    value is not a requirement. You have no good reason not to make a change
    to the default value for some PUA characters.

    I see the point about not proliferating separate PUA spaces. But that is
    the only argument I see on your side. Perhaps the UTC will be less dead
    set against this if the arguments are realised, and perhaps if the few
    non-western UTC members realise how the process is biased against the
    languages of their countries.

    >--Ken
    >

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    


    This archive was generated by hypermail 2.1.5 : Tue Mar 30 2004 - 06:26:16 EST