From: Language Analysis Systems, Inc. Unicode list reader (Unicode-mail@las-inc.com)
Date: Thu Apr 29 2004 - 17:04:46 EDT
>One significant thing they want to do with them is to use them to
>represent the existing PUA areas in non-Unicode CJK standards, many of
>which have various overlapping de facto assignments to Han characters.
I know. I meant to include this under "stuff which isn't in Unicode yet
but probably will be."
>> ...it starts to cut
>> down significantly on the code points available for actual
>> standardization.
>
>This last point isn't very strong. Nobody who's actually considered
>the problem thinks we are going to get past plane 3; all
>counterarguments known to me are of the form "Finite allocations are
>always too small in principle; we have overrun ASCII and IP version 4
>and so on, and so we need indefinite extensibility." This argument
>just doesn't apply to characters, short of joining the Galactic Empire.
Fair enough. I don't like it for purely aesthetic reasons-- it's ugly
and wasteful, but I don't expect to convince anyone with that. This is
the best argument I can come up with. Are you in favor of setting aside
more PUA space, or do you have a better argument against it?
>The obvious problem is that there is no way to force something to be a
>combining character of class X. Allocating 256 marker characters in
>Plane 14 would solve this, but probably at an unacceptable cost in
>implementation complexity.
This is where I'm really confused. Why would anyone want to do this?
Combining character classes are used only in normalization, and the one
hard-and-fast thing about the PUA is that PUA characters can't
participate in normalization. You can still have combining characters
in the PUA with the current default properties-- that can usually be
done entirely within a font. Treating characters as equivalent for the
purposes of equality comparison can be done (theoretically, anyway) with
tailored collation orders. What else is there?
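To make the point about default properties concrete, here is a minimal
sketch using Python's standard unicodedata module. It assumes the
default property values shipped with the Unicode Character Database,
and the names PUA_CHAR, ccc, reordered, and with_pua are made up for
illustration. It shows that a PUA code point has combining class 0, so
normalization treats it as a starter: it is never reordered, and marks
on either side of it are not reordered across it.

```python
import unicodedata

PUA_CHAR = "\uE000"    # first code point of the BMP Private Use Area

# Two standard combining marks with different combining classes.
DOT_BELOW = "\u0323"   # COMBINING DOT BELOW, class 220
ACUTE = "\u0301"       # COMBINING ACUTE ACCENT, class 230


def ccc(ch):
    """Return the canonical combining class of a single character."""
    return unicodedata.combining(ch)


# Canonical reordering: within a run of non-starters, the mark with the
# lower class is moved first, so the dot below ends up before the acute.
reordered = unicodedata.normalize("NFD", "a" + ACUTE + DOT_BELOW)

# A PUA character has class 0, so it splits the run of marks in two and
# normalization leaves the whole sequence exactly as it was.
with_pua = "a" + ACUTE + PUA_CHAR + DOT_BELOW
pua_normalized = unicodedata.normalize("NFD", with_pua)

print(ccc(PUA_CHAR), ccc(DOT_BELOW), ccc(ACUTE))
print(reordered == "a" + DOT_BELOW + ACUTE)
print(pua_normalized == with_pua)
```

Nothing here stops a font from rendering U+E000 as a mark attached to
the preceding base; the sketch only shows what normalization does.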
I'm going to take a wild guess and say that the reason this is an issue
is that people want to have variation selectors for combining marks,
and have them work right even in the presence of normalization. It does
kind of seem like there's a need here, but let's discuss THAT problem
rather than having the whole thing degenerate into long discussions
about how the default PUA properties discriminate against certain
classes of users. I don't think this is a problem you can solve with
the PUA, although I do think you can work around it with the PUA.
--Rich Gillam
Language Analysis Systems, Inc.