Re: Wanted: synonyms for Age

From: karl williamson (
Date: Fri Aug 07 2009 - 14:50:52 CDT

    I forgot to cc the public list on this, which I am now doing. Perhaps
    that is for the better, since I realize that I'm confused about what
    "reserved" means. I thought from NamesList.txt that reserved
    characters were unassigned ones that were never going to be assigned
    because of some constraint on them, such as being placeholders, like
    the following:

    1D51D <reserved>
            x (black-letter capital z - 2128)

    where the code points around it are assigned, but this one essentially
    duplicates 2128, and so is skipped.

    But looking at extracted/DerivedGeneralCategory.txt, it appears that
    reserved is any Cn code point that isn't a noncharacter.
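
    To make that reading concrete, here is a minimal sketch, assuming
    Python's standard unicodedata module (which reflects whatever UCD
    version the interpreter ships with, not necessarily 5.2). The helper
    names is_noncharacter and is_reserved are my own, for illustration
    only; they are not terms from the UCD files.

```python
import unicodedata


def is_noncharacter(cp: int) -> bool:
    """True for the 66 noncharacters: U+FDD0..U+FDEF plus the last
    two code points of every plane (U+nFFFE and U+nFFFF)."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_reserved(cp: int) -> bool:
    """Sketch of the reading above: General_Category is Cn and the
    code point is not a noncharacter. (Hypothetical helper.)"""
    return unicodedata.category(chr(cp)) == "Cn" and not is_noncharacter(cp)


if __name__ == "__main__":
    for cp in (0x1D51D, 0x2128, 0xFDD0, 0x1FFFE):
        print(f"U+{cp:04X} reserved={is_reserved(cp)}")
```

    Under this definition the hole at 1D51D comes out as reserved just
    like any other unassigned code point, which is exactly why the data
    files alone do not distinguish the two cases.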

    karl williamson wrote:
    > Kenneth Whistler wrote:
    >> Karl Williamson wrote:
    >>> ... I thought I should add some things I've been thinking about to
    >>> make sure I understand. Feel free to correct me.
    >>> Each Unicode property is defined on a subset of the Unicode code
    >>> points. Many are defined on the complete set, but some are not.
    >>> Name is one such property: surrogates and private use code
    >>> points, for example, have no name.
    >> Actually Name *is* defined on the complete set. The values for
    >> the Name property are strings, and for reserved code points
    >> (and some other code point types), the value of the Name property
    >> is the null string.
    >> Since this has been confusing to a lot of people, the Unicode 5.2
    >> text about Unicode character names has been substantially updated
    >> to clarify this. See Section 4.8 Name--Normative in the Chapter 4
    >> pdf posted for review. (Accessible from the Unicode 5.2 beta
    >> page.)
    > It was helpful looking at the 5.2 draft. But it brought up another
    > question. I don't see anywhere in the UCD (except in NamesList.txt) any
    > mention of reserved code points. I don't see any way to distinguish
    > between these and code points that are otherwise unassigned, and not
    > permanently non-characters. Perhaps it is thought that that information
    > is not relevant, but the draft mentions "reserved-NNNN" as a possible
    > identifying string for such a code point. Again, perhaps it is assumed
    > that only in the text of the standard would anyone wish to make this
    > distinction.
    >>> It's unclear to me what the definition was, if any, in releases
    >>> before the Unknown property value was added to the Script
    >>> property, for code points that didn't have any other Script
    >>> property value (and similarly for a number of other catalog
    >>> properties).
    >> The issue of default values is explained now in more detail
    >> in Section 4.2.8 Default Values in UAX #44. See the Unicode 5.2
    >> proposed update:
    >> As far as the default value of the Script property is concerned,
    >> before Script=Unknown was introduced, the Scripts.txt file itself
    >> defined Script=Common as the default value.
    > I had overlooked this. But there are other examples where at one
    > time no default value was given but now there is one, like NaN for
    > numeric value. Was the default the null string in earlier releases,
    > or was it just undefined?
    >> [snip]
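
    The point in the quoted exchange that a property is defined on every
    code point, with a default value (the null string for Name, NaN for
    numeric value) where nothing is listed explicitly, can be sketched
    as follows. This assumes Python's standard unicodedata module, whose
    name() and numeric() take an optional default argument; the wrapper
    names name_or_empty and numeric_or_nan are hypothetical, and the
    behavior shown is that library's, not a statement about any
    particular UCD release.

```python
import unicodedata


def name_or_empty(cp: int) -> str:
    """Treat Name as total: return "" where the library has no name."""
    return unicodedata.name(chr(cp), "")


def numeric_or_nan(cp: int) -> float:
    """Treat the numeric value as total: return NaN where none is given."""
    return unicodedata.numeric(chr(cp), float("nan"))


if __name__ == "__main__":
    for cp in (0x2128, 0x1D51D, 0xE000, 0x35):
        print(f"U+{cp:04X} name={name_or_empty(cp)!r} "
              f"numeric={numeric_or_nan(cp)}")
```

    Without the default argument both calls raise ValueError instead,
    which is closer to the "just undefined" reading asked about above.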