RE: NO-BREAK SPACE vs SPACE plus other slight numeric input

From: Addison Phillips
Date: Thu Mar 18 1999 - 12:58:32 EST

One objection to your previous assumption, John: if your code formats a
number and then populates the field with it, the user can easily edit the
value without removing your NBSP. Some widget implementations might convert
your NBSP to SPACE as well... so you still have to parse the value.

IMX, it is better to always normalize separators away (especially remove
whitespace, but also convert decimal separator to point and remove thousands
separators if the locale can be identified) before attempting to parse the
number. Then you can work on less trivial problems like evaluating whether
the code points are numeric and, if so, from which numbering set. For
Latin-1 languages this is relatively trivial. It's Asian "wide" equivalents
and non-Arabic numbering systems that munge up the works.

It is almost always better to avoid numeric input if you can... ::sigh::



        Addison Phillips
        Director, Globalization Services
        SimulTrans, L.L.C.
        2606 Bayshore Parkway
        Mountain View, California 94043 USA

        +1 650-526-4652 (direct telephone)

        "22 languages. One release date."

-----Original Message-----
From: Alain
Sent: Wednesday, March 17, 1999 10:51 PM
To: Unicode List
Subject: Re: NO-BREAK SPACE vs SPACE plus other slight numeric input

At 10:04 99-03-17 -0800, John O'Conner wrote:
>Since my original post, I've been told that the NBSP is used for output
>only, and that French users do not typically enter numbers with spaces as
>the digit separator. That is, a French user would enter "1234,56" but not
>"1 234,56". The latter form would be used for output only. If that is true
>then my problem goes away nicely.

[Alain] That is true a priori but not a posteriori. Ideally, input of
numbers would be decoupled from their presentation. However, there are
many cases, and this is easier said than done:

1. Input can come from an output which may have NBSPs and a comma as
   decimal delimiter.
2. Input can be made using the alphanumeric portion of the keyboard by a
   naïve user, in which case anything is possible, including case 1 (*).
3. Input can be made without any delimiter for fixed precision numbers,
   such as numbers coming from an accounting ledger output in which no
   decimal delimiter was ever to be printed, say, on a preprinted form.
4. Input can be done with the numeric keypad, in which case there is,
   theoretically, no need to care about NBSP or about the decimal
   delimiter, as input of pure numbers there should not depend
   on presentation.
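
Case 3 can be sketched as follows, assuming the application knows the fixed
precision of the field in advance. The function name and the "places"
parameter are hypothetical; the digits carry an implied decimal position and
no delimiter at all:

```python
from decimal import Decimal


def parse_implied_decimal(digits, places=2):
    """Parse delimiter-free input whose last `places` digits are fractional."""
    digits = digits.strip()
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected ASCII digits only, got {digits!r}")
    # scaleb(-places) shifts the decimal position left by `places` digits.
    return Decimal(digits).scaleb(-places)
```

With places=2, the ledger value "123456" is read as Decimal("1234.56").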

Concerning case 4, a misinterpretation of international keyboard standards
(the latest involved being ISO/IEC 9995-2 [1994] and amendment 1 to ISO/IEC
9995-7) has led keyboard implementers to treat the key used for decimal
delimitation as a graphic character key, whereas it is a function key
indicating the separation between the abstract integer part of a number and
its fractional part. This has caused problems with much software in
countries where more than one usage is current. In Canada, where we can use
both the decimal point and the decimal comma, if the keyboard driver
returned a point it did not work in "comma-based" applications, and vice
versa, in many old applications.

To that effect, ISO/IEC 9995-7 has been amended to create a symbol distinct
from point or comma, to indicate to naïve implementers that this is a
function key, not a character key, and a note explains all this. In the
meantime, any character delimiter coming from the numeric keypad should be
interpreted by smart software as if it were the function and not the
character, decoupled from what it will be on output. Input should then be
quite tolerant of that delimiter. The same could be said about spaces (or
NBSPs, or "illegal" points) that separate triads. That said, the erroneous
use of points as triad separators in France (emphatically taught in
franco-French schools to be incorrect usage) causes an extra problem...
Solving it by technical means is like trying to square the circle, so
education of users on this last point is important. A point should never
be interpreted as a triad separator.
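
A hedged sketch of such tolerant input, in Python, for the case where the
locale is not trusted to tell point from comma. The function name is
hypothetical. It treats either a point or a comma as the decimal delimiter,
discards spaces and NBSPs between triads, and refuses input that contains
more than one point or comma rather than guessing that one of them is a
triad separator:

```python
from decimal import Decimal


def parse_tolerant(text):
    """Accept '.' or ',' as the decimal delimiter; never as a triad separator."""
    # str.isspace() is true for SPACE, NO-BREAK SPACE and NARROW NBSP alike.
    compact = "".join(ch for ch in text if not ch.isspace())
    if compact.count(".") + compact.count(",") > 1:
        raise ValueError(f"more than one point or comma in {text!r}")
    # Whatever delimiter was typed, it only marks the integer/fraction split.
    return Decimal(compact.replace(",", "."))
```

Under this rule "1 234,56" and "1234.56" both parse to the same value,
"1.234" is read as one point two three four, and "1.234,56" is rejected so
the user can be asked to re-enter it. Note that Decimal also accepts forms
such as exponents, which a real input field would probably want to filter.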

Alain LaBonté
coeditor, CAN/CSA Z243.230-1998 (Canadian locales, including French one)
project editor, ISO/IEC 9995 series of keyboard standards (**)

(*) I have problems all the time with my bank transactions over the
    Internet. Even when I use the French presentation from my bank
    (Banque Scotia), they force me to enter decimal points while I
    always use decimal commas... Fortunately I never use a point as
    a triad separator, so the only minor problem is that I have to
    redo the entry each time I type it without thinking about that
    detail... I wrote to them about this but I got no answer. (; Perhaps
    I should write to their president rather than to technical support.
    Problems in user interfaces often begin with a lack of understanding
    on the part of technical support people, for whom human factors are
    the least of concerns.

(**) Coeditors for part 7: Bernard Chauvois (Éducation nationale, France)
                           Fred Bealle (EduCirc, Canada)
