RE: Regulating PUA.

From: Peter Constable (petercon@microsoft.com)
Date: Mon Jan 22 2007 - 18:02:25 CST


    Interesting. This isn’t obvious to me, though perhaps the character property model makes this clear. I presume you say #3 isn’t permitted because the stability policy includes this constraint wrt Noncharacter_Code_Point:
     
    Unicode 3.1+
    The Noncharacter_Code_Point property is an immutable code point property, which means that its property values for all Unicode code points will never change.
     
    The reason this isn’t obvious to me is that it’s not clear if Noncharacter_Code_Point as a property is perceived as uni-valued – i.e. a set that does not necessarily partition the code space – or as a binary-valued property that partitions the code space. If it is the former, then the open question is whether that is an open set to which new code points can be added.
     
    If this is spelled out, where is it spelled out?
     
     
    Peter
     
    ________________________________

    From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of Mark Davis
    Sent: Monday, January 22, 2007 2:05 PM
    To: Eric Muller
    Cc: Ruszlan Gaszanov; unicode@unicode.org
    Subject: Re: Regulating PUA.
     
    One further correction. The number of noncharacter code points is limited according to the Unicode stability policies (http://www.unicode.org/standard/stability_policy.html), so #3 is also not permitted. However, if people needed a larger range of process-internal values and didn't want to use private use codes, there is nothing to prevent them from using sequences of noncharacter code points.
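
    To illustrate the idea of using sequences of noncharacters for a larger process-internal range, here is a minimal sketch. The fixed-width base-66 scheme and the names `encode_internal` and `decode_internal` are assumptions made for illustration; nothing in the standard or in this thread prescribes a particular encoding.

```python
# The 66 noncharacters: U+FDD0..U+FDEF plus the last two code points of
# each of the 17 planes (U+FFFE, U+FFFF, U+1FFFE, ..., U+10FFFF).
NONCHARS = [chr(cp) for cp in range(0xFDD0, 0xFDF0)] + [
    chr((plane << 16) | low) for plane in range(17) for low in (0xFFFE, 0xFFFF)
]
INDEX = {ch: i for i, ch in enumerate(NONCHARS)}
BASE = len(NONCHARS)  # 66


def encode_internal(value: int, width: int = 2) -> str:
    """Encode value as a fixed-width sequence of noncharacters (hypothetical scheme)."""
    if not 0 <= value < BASE ** width:
        raise ValueError("value does not fit in the requested width")
    digits = []
    for _ in range(width):
        value, digit = divmod(value, BASE)
        digits.append(NONCHARS[digit])
    # Digits were produced least significant first; emit most significant first.
    return "".join(reversed(digits))


def decode_internal(seq: str) -> int:
    """Invert encode_internal; raises KeyError on anything that is not a noncharacter."""
    value = 0
    for ch in seq:
        value = value * BASE + INDEX[ch]
    return value
```

    With a width of 2 this yields 66 * 66 = 4356 distinct internal values. As with single noncharacters, such sequences are meant to stay inside the process that creates them.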

    Mark
    On 1/21/07, Mark Davis <mark.davis@icu-project.org> wrote:
    As Eric said, this is already provided for.
    1. There are already 66 code points available for process-internal use, called noncharacters (see below)
    2. It would be backwards incompatible for the consortium to make ANY change in PUA characters. There is, IMO, essentially zero chance of this happening. So it is not worth discussing any further.
    3. If someone really wanted to propose additional noncharacter code points, on the other hand, that is certainly possible. (And as a reminder, NO proposal that is circulated on this list is taken up by the UTC unless a written proposal is submitted to http://www.unicode.org/reporting.html (or by Unicode members via internal mechanisms).) One would have to make a very good case for the need, however.
    FDD0..FDEF      # Cn  [32]
    FFFE..FFFF      # Cn   [2]
    1FFFE..1FFFF    # Cn   [2]
    2FFFE..2FFFF    # Cn   [2]
    3FFFE..3FFFF    # Cn   [2]
    4FFFE..4FFFF    # Cn   [2]
    5FFFE..5FFFF    # Cn   [2]
    6FFFE..6FFFF    # Cn   [2]
    7FFFE..7FFFF    # Cn   [2]
    8FFFE..8FFFF    # Cn   [2]
    9FFFE..9FFFF    # Cn   [2]
    AFFFE..AFFFF    # Cn   [2]
    BFFFE..BFFFF    # Cn   [2]
    CFFFE..CFFFF    # Cn   [2]
    DFFFE..DFFFF    # Cn   [2]
    EFFFE..EFFFF    # Cn   [2]
    FFFFE..FFFFF    # Cn   [2]
    10FFFE..10FFFF  # Cn   [2]
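
    The ranges listed above follow a simple pattern, so membership can be tested without a lookup table. The following is a minimal sketch; the function name `is_noncharacter` is an assumption made for illustration, not an API from any library.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the 66 Unicode noncharacter code points."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    # The contiguous block of 32 noncharacters in the BMP.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The last two code points of every plane end in FFFE or FFFF.
    return (cp & 0xFFFE) == 0xFFFE


# Enumerate the whole code space to confirm the total.
NONCHARACTERS = [cp for cp in range(0x110000) if is_noncharacter(cp)]
assert len(NONCHARACTERS) == 66  # 32 + 2 * 17 planes
```

    Private use code points such as U+E000 are not matched by this test, since they are assigned characters rather than noncharacters.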

    Mark


    On 1/21/07, Eric Muller <emuller@adobe.com> wrote:
            Ruszlan Gaszanov wrote:
    > So, why don't we split the PUA into character-PUA (reserved for representing non-standard characters) and non-character-PUA (reserved for process-internal uses)?
            This problem is already solved, using noncharacters: the last two
            characters of each plane and U+FDD0..U+FDEF. See TUS 5, section 16.7,
            page 549, or TUS 4, section 15.7, page 398.
            
            Eric.
            
            



    --
    Mark






    This archive was generated by hypermail 2.1.5 : Mon Jan 22 2007 - 18:05:10 CST