Re: An attempt to focus the PUA discussion [long]

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Apr 30 2004 - 20:40:14 EDT

    ----- Original Message -----
    From: "Ernest Cline" <ernestcline@mindspring.com>
    To: "Kenneth Whistler" <kenw@sybase.com>; <peterkirk@qaya.org>
    Cc: <unicode@unicode.org>; <kenw@sybase.com>
    Sent: Saturday, May 01, 2004 1:42 AM
    Subject: Re: An attempt to focus the PUA discussion [long]

    >
    > > [Original Message]
    > > From: Kenneth Whistler <kenw@sybase.com>
    > >
    > > On the other hand, I could not expect any software doing
    > > Unicode normalization to pay any attention to *my* interpretation
    > > of those equivalences, and if I really wanted to process data
    > > using such equivalences, it would be up to me to write the
    > > software to do so.
    >
    > Decompositions and canonical combining classes are the
    > two things that affect normalization, and are why Unicode
    > limits changes to these two to be made only in an upwardly
    > compatible manner. This is what makes assigning those
    > properties to private use characters so tricky.

    As far as I know, the stability of normalization matters only for the
    interchange of data that uses and assumes the same standard Unicode conventions.
    It is not fundamental for PUA characters, which are used under private
    conventions: users who agree on such a convention can at the same time agree on
    their own normalization.

    Stability for PUA characters is guaranteed only for applications that either
    don't handle them or treat them with the Unicode default properties. If someone
    needs to assign new diacritics, new decomposable characters, or new precomposed
    characters in the PUA, and handle them with their own normalization, this should
    be OK.
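
    A minimal sketch of such a private normalization, in Python with the standard
    unicodedata module. The mapping of U+E000 to "x" plus a combining acute accent
    is a purely hypothetical private agreement, and private_nfd is an illustrative
    name, not an existing API:

```python
import unicodedata

# Hypothetical private agreement between users:
# U+E000 is a precomposed "x with acute accent".
PRIVATE_DECOMPOSITION = {"\uE000": "x\u0301"}

def private_nfd(text):
    """Expand the private decompositions, then apply standard NFD."""
    expanded = "".join(PRIVATE_DECOMPOSITION.get(ch, ch) for ch in text)
    return unicodedata.normalize("NFD", expanded)

# Standard normalization knows nothing about the agreement and leaves
# the private use character untouched.
standard_result = unicodedata.normalize("NFD", "\uE000\u00E9")

# The private normalization decomposes both the private character and
# the standard precomposed U+00E9.
private_result = private_nfd("\uE000\u00E9")
```

    Software outside the agreement still sees an unchanged U+E000, so the standard
    normalization forms are not affected by the private layer.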

    After all, this is what many fonts do every day: they internally assign some
    codes to create ligatures or recognize variant forms, and these new private
    "characters" are internally mapped to PUA code points, using their own
    normalizations. As the resulting string of reordered and rearranged glyphs will
    not be interchanged but only used locally to render a text graphically, this
    already falls within what is allowed for the PUA.

    These fonts (and the text layout engines that use them) don't care about the
    normative default properties of PUA code points, because they use them with
    whatever properties they want: joining types, case mappings for special styles
    like SmallCaps, mirroring, bidirectional properties, and so on are freely
    changed from the default assignments in Unicode. GSUB tables can also be viewed
    as a normalization step performed by renderers to translate a series of
    standard Unicode code points into a string of glyph IDs, whose values generally
    match either the standard code point they represent or a PUA code point.
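
    To illustrate the GSUB analogy, here is a toy sketch in Python. It assumes a
    hypothetical font whose glyph IDs equal the code points they render and whose
    ligature glyphs sit at PUA values; the LIGATURES table and the shape function
    are invented for the illustration, and real GSUB lookups are far richer:

```python
# Hypothetical ligature table: pairs of glyph IDs replaced by one glyph ID
# taken from the PUA range.
LIGATURES = {
    (ord("f"), ord("i")): 0xE000,  # "fi" ligature
    (ord("f"), ord("l")): 0xE001,  # "fl" ligature
}

def shape(text):
    """Map code points to glyph IDs, substituting known ligature pairs."""
    glyphs = [ord(ch) for ch in text]  # toy cmap: glyph ID == code point
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])  # two glyphs become one private glyph
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out

fin_glyphs = shape("fin")
flag_glyphs = shape("flag")
```

    The glyph string is used only to draw the text and is never interchanged, so
    the PUA values it contains stay private to the font and the renderer.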

    The default combining class 0 of PUA characters is necessary in Unicode so that
    an application that does not know their contextual semantics will not attempt
    to reorder them through the standard normalization algorithm. But I don't think
    this limits applications that use PUA characters contextually, with other
    combining classes.
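
    The difference can be sketched in Python with the standard unicodedata
    module. The class 220 given to U+E001 is a hypothetical private assignment,
    and private_ccc and private_reorder are illustrative names only:

```python
import unicodedata

PUA_MARK = "\uE001"  # private use; a combining mark only by private agreement
ACUTE = "\u0301"     # COMBINING ACUTE ACCENT, combining class 230
CEDILLA = "\u0327"   # COMBINING CEDILLA, combining class 202

# Hypothetical private convention: U+E001 has combining class 220.
PRIVATE_CCC = {PUA_MARK: 220}

def private_ccc(ch):
    """Private class if one is agreed, otherwise the standard class."""
    return PRIVATE_CCC.get(ch, unicodedata.combining(ch))

def private_reorder(text):
    """Stable sort of each run of non-starters, using the private classes."""
    out, run = [], []
    for ch in text:
        if private_ccc(ch) == 0:
            out.extend(sorted(run, key=private_ccc))  # sorted() is stable
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=private_ccc))
    return "".join(out)

sample = "a" + ACUTE + PUA_MARK + CEDILLA

# Standard NFD sees U+E001 as a starter (class 0), so nothing moves across it.
standard = unicodedata.normalize("NFD", sample)

# Under the private convention the three marks form one run and are reordered.
private = private_reorder(standard)
```

    An application outside the agreement only ever performs the first step, which
    is why the default class 0 keeps it from disturbing the private data.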

    So for me all PUA characters can be decomposable and reorderable within the
    private convention that defines private semantics for them. It is not the
    responsibility of Unicode to forbid this, and Unicode does not need to inspect
    what is in such a private convention.


