From: Ben Dougall (bend@freenet.co.uk)
Date: Thu Aug 21 2003 - 13:28:46 EDT
i'd say wide. narrow means not incorporating some characters that would
naturally fit into 'white space'. if i was parsing some text i'd
consider a non-breaking space white space and i'd expect my code to
reflect that. why would you not want your code to treat a non-breaking
space or mathematical space as white space?
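
a rough sketch of what the choice means for a parser, in Python. the code points are copied from the two lists in Mark's message quoted below; the names NARROW, BROAD_EXTRA, BROAD and split_tokens are made up for illustration and are not part of the draft. it is only meant to show how a non-breaking space is handled under each definition, not how any real pattern engine does it.

```python
# Code points copied from the two lists in the quoted message.
# All names here are hypothetical, chosen only for this illustration.
NARROW = (
    set(range(0x0009, 0x000D + 1))    # CHARACTER TABULATION..CARRIAGE RETURN (CR)
    | {0x0020}                        # SPACE
    | {0x0085}                        # NEXT LINE (NEL)
    | {0x200E, 0x200F}                # LEFT-TO-RIGHT MARK, RIGHT-TO-LEFT MARK
    | {0x2028, 0x2029}                # LINE SEPARATOR, PARAGRAPH SEPARATOR
)

BROAD_EXTRA = (
    {0x00A0}                          # NO-BREAK SPACE
    | set(range(0x2000, 0x200A + 1))  # EN QUAD..HAIR SPACE
    | {0x202F}                        # NARROW NO-BREAK SPACE
    | {0x205F}                        # MEDIUM MATHEMATICAL SPACE
    | {0x3000}                        # IDEOGRAPHIC SPACE
)

BROAD = NARROW | BROAD_EXTRA


def split_tokens(text, white_space):
    """Split text on runs of characters whose code point is in white_space."""
    tokens, current = [], []
    for ch in text:
        if ord(ch) in white_space:
            # A white-space character ends the token being built, if any.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


# "a", NO-BREAK SPACE, "b", SPACE, "c"
sample = "a\u00a0b c"
narrow_tokens = split_tokens(sample, NARROW)  # NO-BREAK SPACE stays inside a token
broad_tokens = split_tokens(sample, BROAD)    # NO-BREAK SPACE separates tokens
```

with the narrow set the non-breaking space stays glued into the first token; with the broad set it separates tokens like any other space.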
On Thursday, August 21, 2003, at 04:44 pm, Mark Davis wrote:
> There is one open issue I'd like to draw people's attention to: whether to
> have a narrow or broader approach to the whitespace in a pattern
> environment. The narrower definition would be:
>
> 0009..000D ; Pattern_White_Space # <CHARACTER TABULATION>..<CARRIAGE RETURN (CR)>
> 0020 ; Pattern_White_Space # SPACE
> 0085 ; Pattern_White_Space # <NEXT LINE (NEL)>
> 200E..200F ; Pattern_White_Space # LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK
> 2028 ; Pattern_White_Space # LINE SEPARATOR
> 2029 ; Pattern_White_Space # PARAGRAPH SEPARATOR
>
> while the broader one would add:
>
> 00A0 ; Pattern_White_Space # NO-BREAK SPACE
> 2000..200A ; Pattern_White_Space # EN QUAD..HAIR SPACE
> 202F ; Pattern_White_Space # NARROW NO-BREAK SPACE
> 205F ; Pattern_White_Space # MEDIUM MATHEMATICAL SPACE
> 3000 ; Pattern_White_Space # IDEOGRAPHIC SPACE
>
> My judgement is that in a pattern environment the narrower definition
> would be better. One might go so far as recommending that the others be
> quoted, to reduce possible confusion when reading regular expressions,
> queries, or other patterns.
>
> Mark
> __________________________________
> http://www.macchiato.com
> ► “Eppur si muove” ◄
>
> ----- Original Message -----
> From: <Jill.Ramonsky@Aculab.com>
> To: <unicode@unicode.org>
> Sent: Thursday, August 21, 2003 02:44
> Subject: RE: Proposed Draft UTR #31 - Syntax Characters
>
>
>>
>>> This notice is relevant to anyone dealing with programming languages,
>>> query specifications, regular expressions, scripting languages, and
>>> similar domains.
>>
>> That's me.
>>
>> I read the draft, and actually I was very happy with it. No complaints
>> at all. I am particularly happy that the mathematical letters and numbers
>> (1D400-1D7FF) will be permitted in identifiers. This is important because
>> it allows mathematical expressions and programming-language expressions
>> to use the same symbols (for the first time!). I also noted the comment
>> about how specific programming languages could, if they wished, ignore
>> <font> equivalences (and hence ignore the mathematical letters and
>> numbers) - so I guess that keeps everyone happy.
>>
>> I would have used the feedback form, but I didn't see much point as I
>> had no complaints.
>> Jill
>>
>>
>>
>> -----Original Message-----
>> From: Rick McGowan [mailto:rick@unicode.org]
>> Sent: Wednesday, August 20, 2003 7:23 PM
>> To: unicode@unicode.org
>> Subject: Proposed Draft UTR #31 - Syntax Characters
>>
>>
>> This notice is relevant to anyone dealing with programming languages,
>> query specifications, regular expressions, scripting languages, and
>> similar domains.
>>
>> The Proposed Draft UTR #31: Identifier and Pattern Syntax will be
>> discussed at the UTC meeting next week. Part of that document (Section 4)
>> is a proposal for two new immutable properties, Pattern_White_Space and
>> Pattern_Syntax. As immutable properties, these would not ever change once
>> they are introduced into the standard, so it is important to get feedback
>> on their contents beforehand.
>>
>> The UTC will not be making a final determination on these properties at
>> this meeting, but it is important that any feedback on them is supplied
>> as early in the process as possible so that it can be considered
>> thoroughly. The draft is found at http://www.unicode.org/reports/tr31/
>> and feedback can be submitted as described there.
>>
>> Regards,
>> Rick McGowan
>> Unicode, Inc.
>>
>>
>
>
This archive was generated by hypermail 2.1.5 : Thu Aug 21 2003 - 14:26:58 EDT