From: Peter Kirk (email@example.com)
Date: Fri Aug 22 2003 - 14:47:43 EDT
On 22/08/2003 08:31, Mark Davis wrote:
>The purpose of the Pattern Syntax characters is *not* to list everything that is
>a symbol or punctuation mark. That exists independently. Think of them as
>operators in the engine syntax, as "?" or "*" are used today in Perl, or as
>+, -, /, * could be used in math expressions.
>The goal is to have a relatively small, unchangeable list of ranges, which
>contain a reasonable restriction on characters for future syntax characters in a
>general pattern environment. General regular expression engines, for example,
>would *not* add 05C3 HEBREW PUNCTUATION SOF PASUQ as an operator, to indicate
>(say) a non-greedy match variant of *.
>► “Eppur si muove” ◄
Maybe I misunderstood what Marco was talking about. No, I would not
expect a separate SOF PASUQ operator. My point was more that a Hebrew
user might accidentally type or prefer to type SOF PASUQ instead of a
I don't think we should be defining as an "unchangeable list" only Latin
characters for the syntax, thus tying computer languages inseparably to
the Latin alphabet. That would give the Africans some good reasons to
complain that Unicode is too American and/or European. It's the
"unchangeable" which makes me very nervous here. For now all computer
languages are Latin alphabet based, as far as I know, but who knows what
will happen in 50-100 years? Computer languages based on Devanagari,
Arabic or Japanese scripts would be a real possibility (or Cyrillic, but
the punctuation is the same as Latin).
Now it would be a different matter if we could somehow reserve other
punctuation characters for further extension. Then we could allow Latin
punctuation to be used as operators but require that all other
punctuation be quoted.
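
To make that last suggestion concrete, here is a rough sketch in Python of how a pattern engine might tell the three cases apart. It is only an illustration under stated assumptions: the ASCII_OPERATORS set and the classify function are hypothetical names invented here, not part of any Unicode proposal or of any existing regex engine, and "punctuation" is approximated by the Unicode general categories beginning with P or S.

```python
import unicodedata

# Hypothetical operator set, assumed for illustration only:
# a handful of ASCII punctuation marks that an engine treats as syntax.
ASCII_OPERATORS = set("?*+-/|()[]{}.^$\\:")


def classify(ch: str) -> str:
    """Classify a single character under the hypothetical reservation rule.

    "operator": Latin (ASCII) punctuation the engine uses as syntax.
    "reserved": any other punctuation or symbol; it has no meaning yet,
                so it must be quoted to be matched literally, which keeps
                it free for future extension.
    "literal":  everything else (letters, digits, marks, and so on).
    """
    if ch in ASCII_OPERATORS:
        return "operator"
    # General categories starting with P (punctuation) or S (symbol).
    if unicodedata.category(ch)[0] in ("P", "S"):
        return "reserved"
    return "literal"
```

Under these assumptions a colon is classified as an operator, while U+05C3 SOF PASUQ comes out as reserved rather than as a literal, so an engine could reject or flag it when it appears unquoted instead of silently matching it.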
--
Peter Kirk
firstname.lastname@example.org (personal)
email@example.com (work)
http://www.qaya.org/