Re: Wanted: synonyms for Age

From: Mark Davis ⌛
Date: Fri Aug 07 2009 - 15:59:48 CDT

    We do define *Reserved Code Point* = *Unassigned Code Point* = *Undesignated
    Code Point*.

    These are different from:

    *Unassigned Character*. Synonym for *not assigned to an abstract character*.
    This refers to surrogate code points, noncharacters, and reserved code
    points. (See Section 2.4, Code Points and
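
A minimal sketch of how the distinction above might be checked in practice,
assuming Python's standard unicodedata module. The function names and the
returned labels are hypothetical, chosen only for illustration. The module
reflects whatever Unicode version the interpreter ships with (see
unicodedata.unidata_version), so a code point that is reserved in one version
can be assigned in a later one.

```python
import unicodedata


def is_noncharacter(cp: int) -> bool:
    """True for the 66 noncharacters: U+FDD0..U+FDEF and the last
    two code points of each of the 17 planes (U+nFFFE, U+nFFFF)."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def classify(cp: int) -> str:
    """Label a code point using its General_Category.

    The labels are illustrative, not terms defined by the UCD files.
    """
    gc = unicodedata.category(chr(cp))
    if gc == "Cs":
        return "surrogate"
    if gc == "Co":
        return "private use"
    if gc == "Cn":
        # Cn covers both noncharacters and reserved code points;
        # only the fixed noncharacter ranges tell them apart.
        return "noncharacter" if is_noncharacter(cp) else "reserved"
    return "assigned"


if __name__ == "__main__":
    for cp in (0x2128, 0x1D51D, 0xFFFE, 0xD800):
        print(f"U+{cp:04X}: {classify(cp)}")
```

Under this sketch, "reserved" is simply Cn minus the noncharacters, which
matches the reading of extracted/DerivedGeneralCategory.txt described in the
quoted message below.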


    On Fri, Aug 7, 2009 at 12:50, karl williamson wrote:

    > I forgot to include the public list as a cc to this, which I am now doing,
    > but perhaps it is better, as I realize that I'm confused about what reserved
    > means. I thought from NamesList.txt that reserved characters were
    > unassigned ones that were never going to be assigned because of some
    > constraint on them, such as being place-holders. Like the following:
    > 1D51D <reserved>
    > x (black-letter capital z - 2128)
    > where the code points around it are assigned, but this one essentially
    > duplicates 2128, and so is skipped.
    > But in looking at extracted/DerivedGeneralCategory.txt, it appears that
    > reserved is any Cn code point that isn't a non-character.
    > karl williamson wrote:
    >> Kenneth Whistler wrote:
    >>> Karl Williamson wrote:
    >>>> ... I thought I should add some things I've been thinking about to make
    >>>> sure I understand. Feel free to correct me.
    >>>> Each Unicode property is defined on a subset of the Unicode code points.
    >>>> Many are defined on the complete set, but some are not, such as Name, as
    >>>> for example, surrogates and private use code points have no name.
    >>> Actually Name *is* defined on the complete set. The values for
    >>> the Name property are strings, and for reserved code points
    >>> (and some other code point types), the value of the Name property
    >>> is the null string.
    >>> Since this has been confusing to a lot of people, the Unicode 5.2
    >>> text about Unicode character names has been substantially updated
    >>> to clarify this. See Section 4.8 Name--Normative in the Chapter 4
    >>> pdf posted for review. (Accessible from the Unicode 5.2 beta
    >>> page.)
    >> It was helpful looking at the 5.2 draft. But it brought up another
    >> question. I don't see anywhere in the UCD (except in NamesList.txt) any
    >> mention of reserved code points. I don't see any way to distinguish between
    >> these and code points that are otherwise unassigned, and not permanently
    >> non-characters. Perhaps it is thought that that information is not
    >> relevant, but the draft mentions "reserved-NNNN" as a possible identifying
    >> string for such a code point. Again, perhaps it is assumed that only in the
    >> text of the standard would anyone wish to make this distinction.
    >>>> It's unclear to me if in releases before the Unknown property value was
    >>>> added to the Script property, what the definition was, if any, of code
    >>>> points that didn't have any other of the Script property values (and
    >>>> similarly for a number of other catalog properties).
    >>> The issue of default values is explained now in more detail
    >>> in Section 4.2.8 Default Values in UAX #44. See the Unicode 5.2
    >>> proposed update:
    >>> As far as the default value of the Script property is concerned,
    >>> before Script=Unknown was introduced, the Scripts.txt file itself
    >>> defined Script=Common as the default value.
    >> I had overlooked this. But there are other examples in which there at one
    >> time was no default value given, but now there is, like NaN for numeric
    >> value. Was the default the null string for earlier releases, or was it just
    >> undefined?
    >> [snip]
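
To illustrate the default values discussed in the quoted text (the null
string for Name, NaN for numeric value), here is a small sketch, again
assuming Python's unicodedata module. The helper names are hypothetical. Note
that unicodedata.name() and unicodedata.numeric() raise ValueError when no
value exists and no default is passed; that is a choice of the Python API,
and the defaults below are supplied by the caller to mirror the model
described in the thread.

```python
import math
import unicodedata


def name_or_empty(cp: int) -> str:
    """Character name, or the empty string when the code point has none
    (reserved code points, surrogates, private use, and so on)."""
    return unicodedata.name(chr(cp), "")


def numeric_or_nan(cp: int) -> float:
    """Numeric value, or NaN when the code point has no numeric value."""
    return unicodedata.numeric(chr(cp), math.nan)


if __name__ == "__main__":
    for cp in (0x2128, 0x1D51D, 0x00BD):
        print(f"U+{cp:04X}", repr(name_or_empty(cp)), numeric_or_nan(cp))
```

Passing an explicit default treats the property as defined on every code
point, which is the view taken in the replies quoted above.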
