RE: An attempt to focus the PUA discussion [long]

From: Language Analysis Systems, Inc. Unicode list reader (
Date: Thu Apr 29 2004 - 17:04:46 EDT

    >One significant thing they want to do with them is to use them to
    >represent the existing PUA areas in non-Unicode CJK standards, many of
    >which have various overlapping de facto assignments to Han characters.

    I know. I meant to include this under "stuff which isn't in Unicode yet
    but probably will be."

    >> starts to cut
    >> down significantly on the code points available for actual
    >> standardization.
    >This last point isn't very strong. Nobody who's actually considered
    >the problem thinks we are going to get past plane 3; all
    >counterarguments known to me are of the form "Finite allocations are
    >always too small in principle; we have overrun ASCII and IP version 4
    >and so on, and so we need indefinite extensibility." This argument
    >just doesn't apply to characters, short of joining the Galactic Empire.

    Fair enough. My real objection is purely aesthetic: it's ugly and
    wasteful, but I don't expect to convince anyone with that. The
    code-point argument was the best one I could come up with. Are you in
    favor of setting aside more PUA space, or do you have a better argument
    against it?
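
    For a rough sense of scale, here is a small sketch that counts how much
    of the codespace the existing PUA ranges already occupy. The ranges are
    the three standard private-use blocks; the variable names are just
    illustrative.

```python
# Size of the Unicode codespace: 17 planes of 0x10000 code points each.
PLANE_SIZE = 0x10000
PLANES = 17
codespace = PLANES * PLANE_SIZE

# The three existing private-use ranges (inclusive):
# the BMP PUA, plus planes 15 and 16 minus their last two noncharacters.
pua_ranges = [
    (0xE000, 0xF8FF),
    (0xF0000, 0xFFFFD),
    (0x100000, 0x10FFFD),
]
pua_total = sum(hi - lo + 1 for lo, hi in pua_ranges)

print(pua_total)                        # 137468
print(codespace)                        # 1114112
print(f"{pua_total / codespace:.1%}")   # 12.3%
```

    So roughly an eighth of the codespace is private-use today, before any
    further set-aside.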

    >The obvious problem is that there is no way to force something to be a
    >combining character of class X. Allocating 256 marker characters in
    >Plane 14 would solve this, but probably at an unacceptable cost in
    >implementation complexity.

    This is where I'm really confused. Why would anyone want to do this?
    Combining character classes are used only in normalization, and the one
    hard-and-fast thing about the PUA is that PUA characters can't
    participate in normalization. You can still have combining characters
    in the PUA with the current default properties; that can usually be
    done entirely within a font. Treating characters as equivalent for the
    purposes of equality comparison can be done (theoretically, anyway) with
    tailored collation orders. What else is there?
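
    To make the normalization point concrete, here is a minimal sketch
    using Python's standard unicodedata module. It assumes the default
    properties of the Unicode database that ships with the interpreter;
    U+E000 is just an arbitrary PUA code point picked for illustration.

```python
import unicodedata

# U+E000 is the first code point of the BMP Private Use Area.
pua = "\uE000"

# Default properties: general category Co (private use), combining class 0.
print(unicodedata.category(pua))   # Co
print(unicodedata.combining(pua))  # 0

# Canonical ordering sorts adjacent marks by combining class.
# U+0301 (class 230) followed by U+0323 (class 220) is swapped under NFD.
marks = "a\u0301\u0323"
reordered = unicodedata.normalize("NFD", marks)
print(reordered == "a\u0323\u0301")  # True

# A PUA character has class 0, so it behaves like a starter: it is left
# alone, and the marks on either side of it are not reordered across it.
with_pua = "a\u0301" + pua + "\u0323"
unchanged = unicodedata.normalize("NFD", with_pua)
print(unchanged == with_pua)  # True
```

    The PUA character passes through normalization untouched, which is why
    giving it a nonzero combining class would only matter if normalization
    were expected to reorder it.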

    I'm going to take a wild guess and say that the reason this is an issue
    is that people want to have variation selectors for combining marks,
    and have them work right even in the presence of normalization. There
    does seem to be a need here, but let's discuss THAT problem rather than
    letting the whole thing degenerate into long discussions about how the
    default PUA properties discriminate against certain classes of users.
    I don't think this is a problem you can solve with the PUA, although I
    do think you can work around it with the PUA.

    --Rich Gillam
      Language Analysis Systems, Inc.
