Re: Proposed Draft UTR #31 - Syntax Characters

From: Mark Davis (mark.davis@jtcsv.com)
Date: Thu Aug 21 2003 - 11:44:09 EDT

    There is one open issue I'd like to draw people's attention to: whether to have
    a narrow or broader approach to the whitespace in a pattern environment. The
    narrower definition would be:

    0009..000D ; Pattern_White_Space # <CHARACTER TABULATION>..<CARRIAGE RETURN (CR)>
    0020 ; Pattern_White_Space # SPACE
    0085 ; Pattern_White_Space # <NEXT LINE (NEL)>
    200E..200F ; Pattern_White_Space # LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK
    2028 ; Pattern_White_Space # LINE SEPARATOR
    2029 ; Pattern_White_Space # PARAGRAPH SEPARATOR

    while the broader one would add:

    00A0 ; Pattern_White_Space # NO-BREAK SPACE
    2000..200A ; Pattern_White_Space # EN QUAD..HAIR SPACE
    202F ; Pattern_White_Space # NARROW NO-BREAK SPACE
    205F ; Pattern_White_Space # MEDIUM MATHEMATICAL SPACE
    3000 ; Pattern_White_Space # IDEOGRAPHIC SPACE
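
For anyone who wants to compare the two candidate sets concretely, here is a minimal Python sketch. The code point ranges are copied from the two lists above; the names NARROW, BROAD_EXTRA, BROAD and is_pattern_white_space are hypothetical, chosen only for illustration, and are not part of the draft.

```python
# Narrower candidate set, copied from the first list in this message.
NARROW = (
    set(range(0x0009, 0x000D + 1))   # CHARACTER TABULATION..CARRIAGE RETURN
    | {0x0020, 0x0085}               # SPACE, NEXT LINE (NEL)
    | {0x200E, 0x200F}               # LEFT-TO-RIGHT MARK, RIGHT-TO-LEFT MARK
    | {0x2028, 0x2029}               # LINE SEPARATOR, PARAGRAPH SEPARATOR
)

# Characters that only the broader candidate set would add.
BROAD_EXTRA = (
    {0x00A0}                         # NO-BREAK SPACE
    | set(range(0x2000, 0x200A + 1)) # EN QUAD..HAIR SPACE
    | {0x202F, 0x205F, 0x3000}       # NARROW NO-BREAK SPACE, MEDIUM
                                     # MATHEMATICAL SPACE, IDEOGRAPHIC SPACE
)

BROAD = NARROW | BROAD_EXTRA


def is_pattern_white_space(ch: str, broad: bool = False) -> bool:
    """Return True if ch is in the chosen candidate set (hypothetical helper)."""
    return ord(ch) in (BROAD if broad else NARROW)
```

Under the narrower set, is_pattern_white_space("\u00a0") is false, so a NO-BREAK SPACE in a pattern would be treated as ordinary content rather than skipped.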

    My judgement is that in a pattern environment the narrower definition would be
    better. One might go so far as recommending that the others be quoted, to reduce
    possible confusion when reading regular expressions, queries, or other patterns.
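
As a rough illustration of the quoting recommendation, the Python sketch below rewrites the broader-only space characters in a pattern as visible escapes. The \uXXXX escape form is an assumption made for this example; the draft does not prescribe a quoting syntax, and BROADER_ONLY and quote_broader_spaces are hypothetical names.

```python
# Space characters that appear only in the broader candidate set above.
BROADER_ONLY = (
    {0x00A0, 0x202F, 0x205F, 0x3000}
    | set(range(0x2000, 0x200A + 1))  # EN QUAD..HAIR SPACE
)


def quote_broader_spaces(pattern: str) -> str:
    """Replace each broader-only space with a \\uXXXX escape.

    The escape syntax is an assumption for illustration; a real pattern
    language would use whatever quoting mechanism it already defines.
    """
    return "".join(
        f"\\u{ord(ch):04X}" if ord(ch) in BROADER_ONLY else ch
        for ch in pattern
    )
```

With this helper, a NO-BREAK SPACE between "a" and "b" comes out as the eight visible characters a\u00A0b, while an ordinary SPACE is left untouched.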

    Mark
    __________________________________
    http://www.macchiato.com
    ► “Eppur si muove” ◄

    ----- Original Message -----
    From: <Jill.Ramonsky@Aculab.com>
    To: <unicode@unicode.org>
    Sent: Thursday, August 21, 2003 02:44
    Subject: RE: Proposed Draft UTR #31 - Syntax Characters

    >
    > > This notice is relevant to anyone dealing with programming languages, query
    > > specifications, regular expressions, scripting languages, and similar domains.
    >
    > That's me.
    >
    > I read the draft, and actually I was very happy with it. No complaints at
    > all. I am particularly happy that the mathematical letters and numbers
    > (1D400-1D7FF) will be permitted in identifiers. This is important because it
    > allows mathematical expressions and programming-language expressions to use
    > the same symbols (for the first time!). I also noted the comment about how
    > specific programming languages could, if they wished, ignore <font>
    > equivalences (and hence ignore the mathematical letters and numbers) - so I
    > guess that keeps everyone happy.
    >
    > I would have used the feedback form, but I didn't see much point as I had no
    > complaints.
    > Jill
    >
    >
    >
    > -----Original Message-----
    > From: Rick McGowan [mailto:rick@unicode.org]
    > Sent: Wednesday, August 20, 2003 7:23 PM
    > To: unicode@unicode.org
    > Subject: Proposed Draft UTR #31 - Syntax Characters
    >
    >
    > This notice is relevant to anyone dealing with programming languages, query
    > specifications, regular expressions, scripting languages, and similar
    > domains.
    >
    > The Proposed Draft UTR #31: Identifier and Pattern Syntax will be discussed at
    > the UTC meeting next week. Part of that document (Section 4) is a proposal for
    > two new immutable properties, Pattern_White_Space and Pattern_Syntax. As
    > immutable properties, these would not ever change once they are introduced into
    > the standard, so it is important to get feedback on their contents beforehand.
    >
    > The UTC will not be making a final determination on these properties at this
    > meeting, but it is important that any feedback on them is supplied as early in
    > the process as possible so that it can be considered thoroughly. The draft is
    > found at http://www.unicode.org/reports/tr31/ and feedback can be submitted as
    > described there.
    >
    > Regards,
    > Rick McGowan
    > Unicode, Inc.
    >
    >


