RE: numeric properties of Nl characters in the UCD

From: Philippe Verdy (
Date: Tue Nov 25 2003 - 17:49:01 EST


    There's still my unanswered question about the third numeric field not
    filled for some numeric characters (notably Nl characters, i.e. number

    I accept that it cannot be defined for the "numerator one less than the
    denominator" character, but the Latin Roman number 900 has NO defined numeric
    value, and I don't see why. I would accept a rationale based on the contextual
    meaning of the number, if its actual value changed between sources, but I
    don't think that the Roman 900 number letter has any possible value other than

    Since the first reason scripts were standardized, learned and taught
    throughout history was their use for accounting, I doubt that merchants
    would have accepted an ambiguous meaning for these characters. If this ever
    occurred in some local cultures that brought a foreign glyph into their
    script, that use of the glyph creates a new abstract character that merits
    another name in Unicode and other properties.

    So I suggest you load the UCD into any spreadsheet, sort it on the general
    category column, and look at the numeric characters (third column starting
    with N):
    - all "Nd" characters should have all three numeric fields set to the same
    value between 0 and 9,
    - all "Ni" characters should have only their last two fields set, both to
    the same simple integer, and
    - all "Nl" characters should have something set in the third field only
    (except possibly for the "numerator one less than the denominator"
    character, which could have its own "No" category for "Numeric, other".)
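
    The same survey can be sketched without a spreadsheet. The following is a
    minimal sketch, assuming Python's standard unicodedata module as a stand-in
    for the raw UnicodeData.txt file: its decimal(), digit() and numeric()
    functions are taken here to correspond to the three numeric fields. It
    queries whatever Unicode version ships with the interpreter, not the 2003
    data discussed in this thread, so the counts it prints may differ from what
    the spreadsheet would have shown. The helper names (profile,
    tally_numeric_profiles, nl_without_numeric) are illustrative only.

```python
import sys
import unicodedata
from collections import Counter


def profile(ch):
    """Return (general category, has decimal, has digit, has numeric) for one character."""
    return (
        unicodedata.category(ch),
        # Compare against None explicitly: a stored value of 0 is still "set".
        unicodedata.decimal(ch, None) is not None,
        unicodedata.digit(ch, None) is not None,
        unicodedata.numeric(ch, None) is not None,
    )


def tally_numeric_profiles():
    """Count how many N* characters show each combination of filled fields."""
    counts = Counter()
    for cp in range(sys.maxunicode + 1):
        p = profile(chr(cp))
        if p[0].startswith("N"):
            counts[p] += 1
    return counts


def nl_without_numeric():
    """List Nl code points for which no numeric value is recorded."""
    return [
        cp
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) == "Nl"
        and unicodedata.numeric(chr(cp), None) is None
    ]


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for (cat, dec, dig, num), n in sorted(tally_numeric_profiles().items()):
        print(f"{cat}: decimal={dec} digit={dig} numeric={num} -> {n} characters")
    for cp in nl_without_numeric():
        print(f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}")
```

    Any Nl code points listed at the end are the ones with no numeric value in
    the interpreter's copy of the database.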


    -----Original Message-----
    From: [] On Behalf Of
    Mark Davis
    Sent: Tuesday, November 25, 2003 20:10
    To: Arcane Jill;
    Subject: Re: numeric properties of Nl characters in the UCD

    The fields are the way they are for backwards compatibility. If you look at
    the UCD.html, you will see that the actual properties are separated:

    I'd like to remind people again that you should read the documentation in
    UCD.html before trying to make sense of the raw data files.

    ----- Original Message -----
    From: Arcane Jill
    Sent: Tue, 2003 Nov 25 02:42
    Subject: RE: numeric properties of Nl characters in the UCD

    Actually, I don't understand why UnicodeData.txt has no fewer than three
    different fields for numerical value anyway. I mean, it's not as though
    there exists EVEN A SINGLE CODEPOINT for which two or more of these fields
    exist and are defined differently from each other. One never sees, for
    example, a character for which "digit value" is 3 and "numeric value" is 4.
    It seems to me that one single numeric field would suffice.

    You may need a second field to establish what "kind" of number this is
    (decimal digit, whatever), but then maybe you could figure that out from the
    general category anyway.
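
    The claim that no code point has two of these fields defined with
    different values can be checked mechanically. This is a minimal sketch,
    assuming Python's standard unicodedata module, with decimal(), digit() and
    numeric() read as the three fields in question. It reflects the Unicode
    version bundled with the interpreter rather than the 2003 files, and the
    function names are made up for illustration.

```python
import sys
import unicodedata


def numeric_fields(ch):
    """Return (decimal, digit, numeric) for a character, with None for unset fields."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


def is_consistent(fields):
    """True when every field that is set carries the same value."""
    present = [v for v in fields if v is not None]
    # numeric() yields a float and the others ints; 3 == 3.0 holds in Python.
    return all(v == present[0] for v in present)


def find_mismatches():
    """Collect code points whose set fields disagree with each other."""
    return [
        cp
        for cp in range(sys.maxunicode + 1)
        if not is_consistent(numeric_fields(chr(cp)))
    ]


if __name__ == "__main__":
    bad = find_mismatches()
    print("Unicode data version:", unicodedata.unidata_version)
    print("code points with disagreeing fields:", len(bad))
    for cp in bad:
        print(f"U+{cp:04X}", numeric_fields(chr(cp)))
```

    An empty result would support the observation for that data version. It
    would not by itself show that the three fields are redundant, since which
    fields are set also encodes the kind of number.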


    > -----Original Message-----
    > From: Philippe Verdy []
    > Sent: Sunday, November 23, 2003 2:58 AM
    > To: Unicode@Unicode.Org
    > Subject: numeric properties of Nl characters in the UCD
    > I do understand why number letter characters with "Nl" general category
    > don't have a "decimal value" property or an "integer value" property, but
    > not why they don't all have a "numeric value" property in the UCD.


    This archive was generated by hypermail 2.1.5 : Tue Nov 25 2003 - 18:30:57 EST