PUA properties, default or otherwise (was: Re: What is the principle?)

From: Doug Ewell (dewell@adelphia.net)
Date: Wed Mar 31 2004 - 01:38:16 EST

    This discussion has focused pretty tightly on the *default* properties
    of PUA code points, without really addressing the issue of specifying
    new properties to override those defaults, and I think that's a mistake.

    After all, if you're going to define Private Use characters, it really
    isn't enough to specify only the glyphs. You also need to indicate the
    type of character (letter, number, space, etc.), directionality, numeric
    properties, combining class, maybe case, and so forth. This is Unicode,
    after all, where a character is defined as much by its properties as
    anything else.

    As a reminder, I do believe in the PUA for the purpose of exchanging
    rare, personal, or idiosyncratic *TEXT CHARACTERS* in a Unicode
    environment (note to William: I still owe you a response). My invented
    script at [1] is an example: it does not, and will never, qualify for
    standardization in Unicode; but I have sent and received private e-mails
    in it, using a privately defined agreement, exactly what the PUA was
    intended for.

    The page at [2] includes a full list of Unicode properties for each
    character in my invented script. This includes not only directionality,
    but everything else that would be listed for a standardized Unicode
    character in UnicodeData.txt. (I've wanted to do a complete list like
    this for all of CSUR, but it's waaaay down on my list of priorities.)

    I could imagine someone, somewhere... possibly me... writing a Unicode
    subsystem that actually read in UnicodeData.txt (or a compiled version
    of it) and used that to derive its information about character
    properties. Now, if someone defined PUA characters and *actually went
    to the trouble of specifying their properties*, as I did in [2], and if
    this subsystem was able to use that data as an adjunct to
    UnicodeData.txt, then things would work the way Peter Kirk wants. The
    default LTR directionality and other default properties of PUA
    characters would not matter; they would be overridden. But Ken and Rick
    are absolutely right that very few companies are going to see a business
    opportunity in this. Even SC UniPad, which has implemented many
    comparatively arcane features of Unicode, has never done anything with
    the PUA, though it has been on their "future versions" list for 6 years.
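
    To make the idea concrete, here is a minimal sketch of such a
    subsystem in Python. It is an illustration under stated assumptions,
    not a description of any existing product: the names CharProps,
    PropertyDB, and is_private_use are made up for this example, only four
    of the UnicodeData.txt fields are kept, and the private entry for
    U+E000 is a hypothetical right-to-left letter, not a character from
    my script. Range markers such as "<Private Use, First>" are not
    expanded; the PUA defaults (Co, combining class 0, bidi class L) are
    simply hard-coded.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class CharProps(NamedTuple):
    """A subset of the fields found on one UnicodeData.txt line."""
    name: str
    general_category: str
    combining_class: int
    bidi_class: str


# Default properties of Private Use code points (gc=Co, ccc=0, bc=L).
PUA_DEFAULT = CharProps("<Private Use>", "Co", 0, "L")


def is_private_use(cp: int) -> bool:
    """True for the BMP PUA and the two supplementary PUA planes."""
    return (0xE000 <= cp <= 0xF8FF
            or 0xF0000 <= cp <= 0xFFFFD
            or 0x100000 <= cp <= 0x10FFFD)


def parse_unicode_data(lines: Iterable[str]) -> Dict[int, CharProps]:
    """Parse semicolon-separated lines in UnicodeData.txt format."""
    table: Dict[int, CharProps] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(";")
        # Field 0: code point, 1: name, 2: General_Category,
        # 3: Canonical_Combining_Class, 4: Bidi_Class.
        table[int(fields[0], 16)] = CharProps(
            fields[1], fields[2], int(fields[3]), fields[4])
    return table


class PropertyDB:
    """Standard data plus a private agreement used as an adjunct."""

    def __init__(self, standard: Dict[int, CharProps],
                 private: Optional[Dict[int, CharProps]] = None) -> None:
        self.standard = standard
        # A private agreement may only describe Private Use code points;
        # anything else in it is ignored.
        self.private = {cp: p for cp, p in (private or {}).items()
                        if is_private_use(cp)}

    def lookup(self, cp: int) -> Optional[CharProps]:
        if is_private_use(cp):
            # The private agreement overrides the PUA defaults.
            return self.private.get(cp, PUA_DEFAULT)
        return self.standard.get(cp)


# One real line from UnicodeData.txt, and one hypothetical private line.
STANDARD_SAMPLE = ["0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;"]
PRIVATE_SAMPLE = ["E000;HYPOTHETICAL RTL LETTER;Lo;0;R;;;;;N;;;;;"]

db = PropertyDB(parse_unicode_data(STANDARD_SAMPLE),
                parse_unicode_data(PRIVATE_SAMPLE))
print(db.lookup(0xE000).bidi_class)  # overridden by the private agreement
print(db.lookup(0xE001).bidi_class)  # no private data, so the default applies
```

    The point of the design is that the private file uses the same format
    as UnicodeData.txt, so one parser serves both, and the default
    properties matter only for PUA code points that nobody has described.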

    [1] http://users.adelphia.net/~dewell/ewellic.html
    [2] http://users.adelphia.net/~dewell/ew-props.html

    -Doug Ewell
     Fullerton, California
