Re: Question about formatting numerals

From: Jukka K. Korpela
Date: Thu Sep 21 2006 - 01:20:50 CDT

    On Wed, 20 Sep 2006, Addison Phillips wrote:

    > Locales that use spaces in digit groups generally use the regular
    > non-breaking space character (U+00A0).

    That's what the CLDR data specifies, but I'm pretty sure that actual
    data almost universally contains just normal spaces. Non-breakability
    and the amount of spacing are handled at the styling and formatting
    level, if at all. This may slowly change in computer-generated texts
    as the use of CLDR grows.
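
    As a rough illustration of the difference between the two practices,
    the sketch below groups digits with U+00A0 and also converts plain
    spaces between digits into U+00A0. The names group_digits and
    harden_group_spaces are hypothetical, and the "space between two
    digits" rule is only a heuristic assumed here; it would also touch
    things like "room 12 34" that are not digit groups.

```python
import re

NBSP = "\u00A0"  # NO-BREAK SPACE, the group separator discussed above


def group_digits(value: int, sep: str = NBSP) -> str:
    """Group the digits of an integer in threes, separated by `sep`."""
    # Format with commas first, then swap in the desired separator.
    return f"{value:,}".replace(",", sep)


def harden_group_spaces(text: str) -> str:
    """Replace a plain space that sits between two digits with U+00A0.

    Heuristic only: it cannot tell a digit group from two adjacent numbers.
    """
    return re.sub(r"(?<=\d) (?=\d)", NBSP, text)


if __name__ == "__main__":
    # repr() makes the otherwise invisible separator visible as \xa0.
    print(repr(group_digits(1234567)))
    print(repr(harden_group_spaces("1 234 567 units")))
```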

    > Less common spaces I would avoid: they
    > may not translate well to legacy encodings or might not have glyphs available
    > in specific fonts.

    I wouldn't be so worried about conversions to legacy encodings when using
    Unicode for new data. The other concerns are important. Fixed-width spaces
    are generally very poorly supported in fonts, though the thin space might
    be adequate in special cases (where you _know_ that the font in use
    contains that character in a suitable appearance or the rendering engine
    handles the thin space in a suitable way _and_ you can prevent line breaks
    by some means).
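
    The legacy-encoding concern can be checked directly. The sketch
    below, with a hypothetical helper named encodable, tests whether a
    few space characters survive conversion to some common legacy
    encodings. The choice of encodings is an assumption for
    illustration, and the check says nothing about font support, which
    has to be verified separately.

```python
CANDIDATES = {
    "SPACE": "\u0020",
    "NO-BREAK SPACE": "\u00A0",
    "THIN SPACE": "\u2009",
    "NARROW NO-BREAK SPACE": "\u202F",
}

LEGACY_ENCODINGS = ["ascii", "latin-1", "cp1252"]


def encodable(ch: str, encoding: str) -> bool:
    """Return True if `ch` can be represented in `encoding`."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for name, ch in CANDIDATES.items():
        ok = [enc for enc in LEGACY_ENCODINGS if encodable(ch, enc)]
        print(f"U+{ord(ch):04X} {name}: {ok}")
```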

    The ideal character for plain text _would_ be a thin no-break space, but
    it is neither widely supported nor sufficiently well defined. The Unicode
    Standard does not describe its intended meaning and usage sufficiently
    well, I would say - on the surface at least, it appears to be included for
    specific use in a particular language.
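
    Assuming the character meant here is U+202F NARROW NO-BREAK SPACE,
    what the Unicode Character Database does record about it can be
    inspected with Python's standard unicodedata module. This is only a
    sketch: it shows the formal properties (name, general category,
    compatibility decomposition), not the width or usage guidance that
    the paragraph above finds lacking.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, str]:
    """Return (name, general category, decomposition) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.decomposition(ch),
    )


if __name__ == "__main__":
    for ch in ("\u00A0", "\u2009", "\u202F"):
        name, category, decomposition = describe(ch)
        print(f"U+{ord(ch):04X}  {name}  {category}  {decomposition}")
```

    Both no-break characters carry a <noBreak> decomposition to U+0020,
    while the thin space carries a <compat> one; none of these
    properties states a width.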

    > U+00A0 is generally available in most encodings and fonts and has the desired
    > effect. Whether it is proportional or not depends, in large part, on the font
    > used.

    Typically, the no-break space has the same width as the space and it
    preserves that width in formatting processes where the space may expand or
    shrink (especially when producing text that is justified on both sides).
    This may make it unsuitable for typographic reasons.

    Jukka "Yucca" Korpela
