RE: numeric properties of Nl characters in the UCD

From: Philippe Verdy (
Date: Wed Nov 26 2003 - 09:40:10 EST

    Michael Everson writes:
    > >But why do U+10341 [GOTHIC LETTER NINETY] and U+1034A [GOTHIC LETTER NINE
    > >HUNDRED], which are letters that are only ever used to represent the
    > >numbers 90 and 900 respectively (they have no intrinsic phonetic
    > >value), not have a numeric value assigned to them?
    > Because there's no particular value in doing so.
    > The burden is on you (or whomever) to prove that there would be.
    > Otherwise, if it ain't broke, don't fix it.

    The cost of such exceptions is that an application cannot reliably use the
    general categories to detect, evaluate or create numbers in a relevant
    script. So this requires a separate table for each supported script.
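
    To make that cost concrete, here is a minimal sketch in Python, assuming
    the standard unicodedata module as the source of UCD properties. The
    FALLBACK_VALUES table and both function names are hypothetical, invented
    for this illustration; whether the fallback is actually consulted depends
    on the Unicode version of the data shipped with the interpreter.

```python
import unicodedata

# Hypothetical per-script fallback table (an assumption for illustration):
# characters used as numbers whose UCD entry may carry no numeric value
# in the Unicode version at hand. These are the two Gothic letters
# discussed in this thread.
FALLBACK_VALUES = {
    "\U00010341": 90,   # GOTHIC LETTER NINETY
    "\U0001034A": 900,  # GOTHIC LETTER NINE HUNDRED
}


def numeric_value(ch, fallback=FALLBACK_VALUES):
    """Return the numeric value of ch as a float, or None if it has none.

    The UCD value is preferred; the hand-maintained table is consulted
    only when the UCD has no numeric value for the character.
    """
    value = unicodedata.numeric(ch, None)
    if value is not None:
        return value
    extra = fallback.get(ch)
    return float(extra) if extra is not None else None


def is_number_category(ch):
    """True if the general category of ch is Nd, Nl or No."""
    return unicodedata.category(ch).startswith("N")
```

    Every script with such an exception needs its own entries in a table
    like this one, and the category test alone cannot tell the application
    which characters those are.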

    This unnecessarily complicates algorithms that support internationalized
    numeric strings, in an area where the problem could be fixed very simply.

    We do need every character that has a numeric property to be defined either
    as "Nd" (with three non-empty numeric property values), or "Ni" (with two
    non-empty numeric property values), or "Nl" (with one non-empty numeric
    property value) or "No", i.e. "Number, Other" (with no non-empty numeric
    properties), and we need NO category other than "Mn" to have non-empty
    numeric properties.
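
    A consistency check of this kind could be sketched as follows, again
    assuming Python's standard unicodedata module. The rule encoded here is
    a simplified assumption made for the illustration, not the exact rule
    stated above: "Nd" characters are expected to have all three numeric
    fields, and characters outside the N* categories are expected to have
    none. The function names are hypothetical.

```python
import unicodedata


def numeric_field_count(ch):
    """Count the non-empty numeric fields (decimal, digit, numeric) of ch."""
    fields = (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )
    return sum(field is not None for field in fields)


def violations(chars):
    """List (char, category, field_count) for characters breaking the rule.

    Simplified rule (an assumption for this sketch):
      - general category "Nd" implies exactly three numeric fields;
      - any category not starting with "N" implies zero numeric fields.
    Nl and No characters are not checked.
    """
    found = []
    for ch in chars:
        category = unicodedata.category(ch)
        count = numeric_field_count(ch)
        if category == "Nd" and count != 3:
            found.append((ch, category, count))
        elif not category.startswith("N") and count != 0:
            found.append((ch, category, count))
    return found
```

    Running violations() over a whole script block gives the list of
    characters that would need either a category change or a per-script
    exception table.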

    > >BTW I've just noticed that U+10341 has a general category of "Lo"
    > >(Letter, Other), whereas U+1034A has a general category of "Nl"
    > >(Number, Letter), which seems a little odd.
    > It does.

    And it is fixable...

