Re: Question about formatting numerals

From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Thu Sep 21 2006 - 13:04:12 CDT

Next message: Richard Wordingham: "Re: Unicode & space in programming & l10n"

Previous message: Rakesh Sharma: "Problem facing while dealing with full width alpha numeric characters"
In reply to: Addison Phillips: "Re: Question about formatting numerals"
Next in thread: Philippe Verdy: "Re: Question about formatting numerals"

On Thu, 21 Sep 2006, Addison Phillips wrote:

> Jukka K. Korpela wrote:
- -
>> but I'm pretty sure that actual data almost universally contains just
>> normal spaces.
>
> That's probably not true. User input may be "regular spaces", but I think
> you'll find that computer systems generate non-breaking spaces.

Some systems may, but I don't think that's common at all. Think about all
the texts written using text editors or word processors, by people who
rarely even know about the no-break space, much less use it regularly.
Their programs hardly ever convert spaces to no-break spaces. Numeric data
written in text format by programs tends to come from I/O routines that use
no thousands separator, though they might sometimes use a period or a comma
or even a space. But hardly ever a no-break space.
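
To illustrate the point about I/O routines, here is a minimal sketch in
Python (chosen only for illustration; nothing in this thread depends on
it). Default formatting emits no group separator at all, a comma appears
only on request, and a space or no-break space appears only if the
programmer substitutes one by hand. The helper name group_with is
hypothetical.

```python
def group_with(n, sep):
    """Format integer n with sep between groups of three digits.

    Hypothetical helper: it asks for comma grouping and then swaps the
    comma for the requested separator.
    """
    return f"{n:,}".replace(",", sep)


n = 1234567
plain = str(n)                   # default output: no thousands separator
comma = f"{n:,}"                 # comma grouping, only when asked for
spaced = group_with(n, " ")      # ordinary space (U+0020)
nbsp = group_with(n, "\u00a0")   # no-break space (U+00A0), explicit choice
```

Nothing here produces a no-break space unless the code names U+00A0
explicitly.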

> However, here we are dealing with a
> recommendation to content authors. For a number, using a non-breaking space
> will prevent things like line-breaking from interfering with text legibility.

It will, but especially in justified text, it has a price. Besides, for a
number, it would be rather trivial for a rendering engine to avoid (by
default) a line break between sequences of digits even when they are
separated by a space. (Actually, should this be taken into account in
Unicode line breaking rules, by adding NU SP* × NU or at least NU SP × NU
there? Just a thought.)
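
A rough sketch of what the NU SP* × NU idea would mean in practice,
written in Python. This is not the Unicode line breaking algorithm: it
assumes, as a gross simplification, that the only break opportunities
are after runs of U+0020, and it approximates class NU with
str.isdecimal(). The function name break_opportunities is hypothetical.

```python
import re


def break_opportunities(text):
    """Return the indices at which a line break would be allowed.

    Simplified model: a break is allowed after each run of spaces,
    except when the run sits between two decimal digits (the proposed
    NU SP* x NU rule). Leading and trailing runs are ignored.
    """
    allowed = []
    for match in re.finditer(r" +", text):
        start, end = match.span()
        if start == 0 or end == len(text):
            continue  # nothing to separate at the edges of the text
        before, after = text[start - 1], text[end]
        if before.isdecimal() and after.isdecimal():
            continue  # digits on both sides: keep the number together
        allowed.append(end)
    return allowed


sample = "costs 12 345 678 euros"
points = break_opportunities(sample)
```

With this sketch the sample may be broken before "12" and before "euros"
but not inside "12 345 678". Restricting the pattern to a single space
would give the weaker NU SP × NU variant.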

>> I wouldn't be so worried about conversions to legacy encodings when using
>> Unicode for new data.
>
> I would, simply because users will wish to utilize text in many places that
> use legacy encodings. It is bad to have your number suddenly and inexplicably
> become "123?445?789".

You have a very good point here, but I don't think it's about legacy
encodings. Rather, it's about more limited character repertoires and about
legacy software. If you cut and paste numbers from, say, a text document
into a spreadsheet program, you may find that fixed-width spaces are not
recognized as spaces at all, even if no encoding problems are
involved. But on similar grounds, you may run into problems with no-break
spaces, too. Legacy software with simple ASCII-oriented input routines may
misbehave when it sees a no-break space.
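
Both failure modes can be sketched in a few lines of Python. The names
to_legacy and parse_grouped_ascii are hypothetical, and the parser is
only a stand-in for the kind of ASCII-oriented input routine meant
above; real legacy software will differ in detail.

```python
NBSP = "\u00a0"   # NO-BREAK SPACE
THIN = "\u2009"   # THIN SPACE, one of the fixed-width spaces


def to_legacy(text, encoding):
    """Round-trip text through a narrower encoding.

    Characters outside the target repertoire are replaced by "?".
    """
    return text.encode(encoding, errors="replace").decode(encoding)


def parse_grouped_ascii(text):
    """Hypothetical ASCII-oriented routine.

    Only U+0020 is treated as a group separator; anything else is left
    in place and handed to int().
    """
    return int(text.replace(" ", ""))


thin_number = f"123{THIN}445{THIN}789"
nbsp_number = f"123{NBSP}445{NBSP}789"

thin_in_latin1 = to_legacy(thin_number, "latin-1")  # THIN is not in Latin-1
nbsp_in_latin1 = to_legacy(nbsp_number, "latin-1")  # NBSP is in Latin-1
nbsp_in_ascii = to_legacy(nbsp_number, "ascii")     # NBSP is not in ASCII

ok = parse_grouped_ascii("123 445 789")
try:
    parse_grouped_ascii(nbsp_number)
    nbsp_rejected = False
except ValueError:
    nbsp_rejected = True
```

The thin space turns into "?" when squeezed into Latin-1, while the
no-break space survives there but not in ASCII. The parser accepts the
number with ordinary spaces and rejects the one with no-break spaces,
although no encoding conversion is involved at all.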

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/

This archive was generated by hypermail 2.1.5 : Thu Sep 21 2006 - 13:10:40 CDT