RE: Proposed Draft UTR #31 - Syntax Characters

Date: Tue Aug 26 2003 - 03:07:40 EDT

    I'm afraid that's not very practical. Suppose I have a hypothetical
    compiler for some hypothetical programming language, and I download some
    source code from the internet and try to compile it. I expect one of two
    things: either (1) it will compile cleanly, or (2) I will have to UPGRADE
    my compiler (or version of Unicode), after which it will compile.

    I don't expect, however, to have to DOWNgrade my version of Unicode. And I
    can't be expected to store EVERY numbered version of Unicode on my machine.

    I prefer the idea that the list of allowed identifier characters increases
    with each version of Unicode (or equivalently, that a list of excluded
    characters decreases with each version of Unicode).
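
    As a rough sketch of that property (the tables below are made-up toy
    data, NOT real Unicode repertoires, and the function names are purely
    illustrative): if each version's allowed set contains the previous
    one, a compiler only ever needs the newest table, and the oldest
    version that accepts a given identifier can be found by scanning
    upward.

```python
# Toy tables keyed by (major, minor) version. These sets are hypothetical
# placeholders and do not reflect any real Unicode version's contents.
ALLOWED_BY_VERSION = {
    (3, 0): set("abc"),
    (3, 2): set("abcd"),
    (4, 0): set("abcde"),
}


def is_monotonic(tables):
    """Return True if every version's set contains the previous version's set."""
    versions = sorted(tables)
    return all(tables[old] <= tables[new]
               for old, new in zip(versions, versions[1:]))


def min_version_for(identifier, tables):
    """Return the oldest version whose set allows every character, or None."""
    for version in sorted(tables):
        if set(identifier) <= tables[version]:
            return version
    return None
```

    With tables like these, an identifier accepted under one version is
    accepted under every later one, so upgrading can never break source
    that already compiled.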

    Sure - some mischievous types could write deliberately obfuscated code, but
    I think that's irrelevant to us. (They can do that NOW. There are even
    competitions for it.) You only really need to consider ACCIDENTAL mistypes.


    Visual lookalikes are not NECESSARILY a problem, with a smart syntax engine.
    I think it would be pretty useful to have variable names like "my-function"
    (with a hyphen). A smart enough engine could transform the HYPHEN-MINUS into
    either HYPHEN or MINUS as appropriate. A text editor would probably render
    them in different colors anyway (one color for identifiers, another color
    for operators) so there wouldn't necessarily be any confusion.

    (Current C++ compilers do a similar thing today. A template class like
    "A<B<C> >" needs that space, otherwise the ">>" would be interpreted as
    "operator >>". Perhaps even more closely related, COBOL compilers allow
    hyphens both in identifier names AND as a minus sign. Again, you have to
    use spaces to distinguish the two uses.)
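
    A minimal sketch of that spacing rule, for a hypothetical toy language
    (this is not COBOL's actual grammar, and the token names are my own
    assumptions): a HYPHEN-MINUS directly between identifier characters is
    kept as part of the name, while one set off by spaces is lexed as the
    minus operator.

```python
import re

# Hypothetical token grammar for illustration only.
# An identifier may contain internal hyphens ("my-function"); a hyphen
# that is not glued to identifier characters is the minus operator.
TOKEN = re.compile(r"""
    (?P<ident>[A-Za-z][A-Za-z0-9]*(?:-[A-Za-z0-9]+)*)
  | (?P<number>[0-9]+)
  | (?P<minus>-)
  | (?P<space>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Split text into (kind, lexeme) pairs, discarding whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}")
        if match.lastgroup != "space":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

    Under these assumptions "a-b" lexes as one identifier, whereas "a - b"
    lexes as identifier, minus, identifier.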


    -----Original Message-----
    From: Peter Kirk []
    Sent: Monday, August 25, 2003 3:14 PM
    To: Marco Cimarosti
    Cc: ''
    Subject: Re: Proposed Draft UTR #31 - Syntax Characters

    The way round this is to define syntax relative to a
    specific version of Unicode.
