Re: Proposed Draft UTR #31 - Syntax Characters

From: Peter Kirk (peterkirk@qaya.org)
Date: Thu Aug 21 2003 - 18:01:11 EDT


    On 21/08/2003 13:26, Jim Allan wrote:

    > Traditionally in C, NBSP was not counted as white space. See
    > http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccelng/htm/eleme_2.asp
    > for one reference.
    >
    > This may have been accidental, as C white space properties were
    > defined with only the 7-bit ASCII character set in mind.
    >
    > But it would break current C programs if NBSP were defined as white
    > space. Logically then, if we exclude NBSP, other "hard" spaces should
    > also not be defined as white space.
    >
    > Essentially NBSP was treated by many word processors and text editors
    > as simply a printing character, like any other printing character,
    > with no special "spacing" properties. It was only an imitation of a
    > space in appearance. Undefined characters in fonts might also appear
    > as imitations of space in many printing systems. That did not make
    > them white space.
    >
    > Of course under Unicode specifications NBSP is expected to expand like
    > SPACE for justification and so assumes some of the attributes of SPACE.
    >
    > For compatibility I think it best to not include any of the non-breaking
    > spaces as white space.
    >
    > Jim Allan
    >
    Not counting NBSP as whitespace may make it easier to include spacing
    diacritics in patterns, if NBSP rather than space is used to carry them.
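
    A minimal sketch of both points, assuming the default "C" locale and
    UTF-8 encoded input. The helper names is_c_space and count_tokens are
    hypothetical, made up for illustration; only isspace() comes from the
    standard library. In the "C" locale isspace() is true only for space,
    tab, newline, vertical tab, form feed and carriage return, so the
    bytes of NBSP (0xA0 in Latin-1, 0xC2 0xA0 in UTF-8) never split a
    token, and an NBSP carrying a combining mark stays in one piece.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: true if c is white space according to isspace().
   In the default "C" locale this covers only ' ', \t, \n, \v, \f and \r,
   so the Latin-1 NBSP byte 0xA0 is not included. */
static int is_c_space(unsigned char c)
{
    return isspace(c) != 0;
}

/* Hypothetical helper: count the tokens in s, splitting only on
   C white space. Bytes of NBSP and of combining marks are treated
   like any other printing character. */
static size_t count_tokens(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    int in_token = 0;
    for (; *s != '\0'; ++s) {
        if (is_c_space((unsigned char)*s)) {
            in_token = 0;          /* white space ends the current token */
        } else if (!in_token) {
            in_token = 1;          /* first byte of a new token */
            ++n;
        }
    }
    return n;
}
```

    With these helpers, the string "a", SPACE, NBSP, U+0301 COMBINING
    ACUTE ACCENT, SPACE, "b" splits into three tokens, the middle one
    being the NBSP together with its diacritic. If NBSP were white space,
    the diacritic would be left on its own.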

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    

