Re: Proposed Draft UTR #31 - Syntax Characters

From: Peter Kirk (peterkirk@qaya.org)
Date: Fri Aug 22 2003 - 14:47:43 EDT


    On 22/08/2003 08:31, Mark Davis wrote:

    >The purpose of the Pattern Syntax characters is *not* to list everything that is
    >a symbol or punctuation mark. That exists independently. Think of them as
    >operators in the engine syntax, as "?" or "*" are used today in Perl, or as
    >+, -, /, * could be used in math expressions.
    >
    >The goal is to have a relatively small, unchangeable list of ranges, which
    >contain a reasonable restriction on characters for future syntax characters in a
    >general pattern environment. General regular expression engines, for example,
    >would *not* add 05C3 HEBREW PUNCTUATION SOF PASUQ as an operator, to indicate
    >(say) a non-greedy match variant of *.
    >
    >Mark
    >__________________________________
    >http://www.macchiato.com
    >► “Eppur si muove” ◄
    >
    >
    >
    Maybe I misunderstood what Marco was talking about. No, I would not
    expect a separate SOF PASUQ operator. My point was rather that a Hebrew
    user might accidentally type, or prefer to type, SOF PASUQ instead of a
    colon, and so on.

    I don't think we should be defining as an "unchangeable list" only Latin
    characters for the syntax, thus tying computer languages inseparably to
    the Latin alphabet. That would give the Africans some good reasons to
    complain that Unicode is too American and/or European. It's the
    "unchangeable" that makes me very nervous here. For now all computer
    languages are based on the Latin alphabet, as far as I know, but who
    knows what will happen in 50-100 years? Computer languages based on
    Devanagari, Arabic or Japanese scripts would be a real possibility (or
    Cyrillic, but there the punctuation is the same as Latin).

    Now it would be a different matter if we could somehow reserve other
    punctuation characters for further extension. Then we could allow Latin
    punctuation to be used as operators but require that all other
    punctuation be quoted.
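
    To make that concrete, here is a rough sketch of such a rule, not
    anything taken from the draft UTR. It assumes that "Latin punctuation"
    means ASCII punctuation and symbols, that backslash is the quoting
    character, and that every other punctuation or symbol character is
    reserved and rejected unless quoted. The names tokenize and
    is_punct_or_symbol are made up for illustration.

```python
import unicodedata

ESCAPE = "\\"  # assumed quoting character; the draft does not prescribe one


def is_punct_or_symbol(ch):
    """True if the character's general category is P* (punctuation) or S* (symbol)."""
    return unicodedata.category(ch)[0] in ("P", "S")


def tokenize(pattern):
    """Split a pattern into (kind, char) pairs under the sketched rule.

    - ASCII punctuation/symbols are operators.
    - Non-ASCII punctuation/symbols are reserved: they must be quoted.
    - Anything quoted with ESCAPE, and everything else, is a literal.
    """
    tokens = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == ESCAPE:
            if i + 1 >= len(pattern):
                raise ValueError("dangling escape at end of pattern")
            tokens.append(("LITERAL", pattern[i + 1]))
            i += 2
            continue
        if is_punct_or_symbol(ch):
            if ord(ch) < 128:
                tokens.append(("OPERATOR", ch))
            else:
                name = unicodedata.name(ch, "U+%04X" % ord(ch))
                raise ValueError("reserved character must be quoted: " + name)
        else:
            tokens.append(("LITERAL", ch))
        i += 1
    return tokens
```

    With this, "a:b" yields a colon operator between two literals, an
    unquoted U+05C3 SOF PASUQ is rejected as reserved, and a quoted one is
    accepted as a literal. Because the reserved characters are errors today,
    some of them could later be turned into operators without changing the
    meaning of any existing pattern.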

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    

