RE: Specification for XID_Start and XID_Continue

From: Philippe Verdy (
Date: Tue Aug 14 2007 - 20:16:33 CDT

    Not really: a switch statement whose case values are far apart is also
    executed at runtime as a binary lookup over a sorted list of values
    generated by the compiler.

    Although this lookup is well optimized, you can often do better with your
    own binary lookup. If the compiler instead generates a series of
    compare-and-branch-on-equal instructions, the lookup is linear: most of
    the time no branch is taken, and the code falls through the successive
    compares and conditional branches. Which form you get depends on the
    number of cases in the switch (the compiler decides), since a long chain
    of compare-and-branch-on-equal is less efficient than a binary lookup.
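
    A minimal sketch of such a hand-written binary lookup, assuming the set is
    stored as a sorted array of non-overlapping inclusive ranges. The names
    cp_range, demo_ranges, in_ranges and demo_is_start are hypothetical, and
    the table holds placeholder data only, not the real XID_Start set:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive code point range; not taken from any library. */
typedef struct {
    uint32_t first;
    uint32_t last;
} cp_range;

/* Placeholder data for illustration only, NOT the real XID_Start set.
   The array must stay sorted by 'first' and must not overlap. */
static const cp_range demo_ranges[] = {
    { 0x0041, 0x005A },
    { 0x0061, 0x007A },
    { 0x00C0, 0x00D6 },
    { 0x00D8, 0x00F6 },
    { 0x4E00, 0x9FA5 },
};

/* Binary search over the half-open index interval [lo, hi).
   Returns true when cp falls inside one of the ranges. */
static bool in_ranges(const cp_range *ranges, size_t count, uint32_t cp)
{
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (cp < ranges[mid].first)
            hi = mid;          /* cp lies before this range */
        else if (cp > ranges[mid].last)
            lo = mid + 1;      /* cp lies after this range */
        else
            return true;       /* first <= cp <= last */
    }
    return false;
}

/* Thin wrapper, small enough for a compiler to inline. */
static bool demo_is_start(uint32_t cp)
{
    return in_ranges(demo_ranges,
                     sizeof demo_ranges / sizeof demo_ranges[0], cp);
}
```

    With five ranges the loop runs at most three times, and the number of
    iterations grows only logarithmically as ranges are added, whereas a
    chain of compares grows linearly.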

    In all cases, implementing a binary lookup that is coded and optimized in
    your own code will have an insignificant impact on performance, notably if
    the binary lookup function is inlined (some compilers will not inline
    functions with multiple returns, but will still be able to generate a
    local function with short calls).

    Code filled with hardcoded switches that depend on external conditions
    which can change at any time is something to think twice about: such code
    will not adapt easily to those changes. With a table loaded from external
    data, by contrast, the only cost is the one-time initialization that loads
    the data, and this cost becomes negligible when the isXXX() function is
    called many times using the same preinitialized lookup table.
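
    As a rough sketch of that approach, the code below loads the ranges once
    at init time and then answers every query from the preinitialized table.
    It assumes input lines shaped like "0041..005A ; XID_Start # ..." (the
    layout used by the Unicode DerivedCoreProperties.txt data file). All of
    the names (cp_table, parse_line, table_load, table_contains, table_free)
    are hypothetical and not an existing API:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types; not an existing library API. */
typedef struct { uint32_t first, last; } cp_range;
typedef struct { cp_range *ranges; size_t count, capacity; } cp_table;

/* Parse "XXXX..YYYY ; Name # ..." or "XXXX ; Name # ...".
   Returns true only when the line carries the requested property. */
static bool parse_line(const char *line, const char *property, cp_range *out)
{
    unsigned long first, last;
    char name[64];

    if (sscanf(line, "%lx..%lx ; %63s", &first, &last, name) != 3) {
        if (sscanf(line, "%lx ; %63s", &first, name) != 2)
            return false;      /* comment, blank line, or other syntax */
        last = first;          /* single code point */
    }
    if (strcmp(name, property) != 0 || first > last || last > 0x10FFFFUL)
        return false;
    out->first = (uint32_t)first;
    out->last = (uint32_t)last;
    return true;
}

static int cmp_range(const void *a, const void *b)
{
    const cp_range *x = a, *y = b;
    return (x->first > y->first) - (x->first < y->first);
}

static void table_free(cp_table *t)
{
    free(t->ranges);
    t->ranges = NULL;
    t->count = t->capacity = 0;
}

/* One-time initialization: read every matching range, then sort.
   Assumes the ranges for one property do not overlap. */
static bool table_load(cp_table *t, FILE *fp, const char *property)
{
    char line[512];
    cp_range r;

    t->ranges = NULL;
    t->count = t->capacity = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (!parse_line(line, property, &r))
            continue;
        if (t->count == t->capacity) {
            size_t cap = t->capacity ? t->capacity * 2 : 64;
            cp_range *p = realloc(t->ranges, cap * sizeof *p);
            if (p == NULL) {
                table_free(t);
                return false;
            }
            t->ranges = p;
            t->capacity = cap;
        }
        t->ranges[t->count++] = r;
    }
    if (t->count > 1)
        qsort(t->ranges, t->count, sizeof *t->ranges, cmp_range);
    return t->count > 0;
}

/* Per-call cost: a binary search over the preinitialized table. */
static bool table_contains(const cp_table *t, uint32_t cp)
{
    size_t lo = 0, hi = t->count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (cp < t->ranges[mid].first)
            hi = mid;
        else if (cp > t->ranges[mid].last)
            lo = mid + 1;
        else
            return true;
    }
    return false;
}
```

    An isXidStart() built this way would call table_load() once with the
    property name "XID_Start" and then delegate to table_contains(). Moving
    to a newer Unicode version then means replacing the data file rather than
    reviewing the code.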

    For this reason, I avoid switch statements in my code when they depend
    heavily on specifications that are not part of the code design itself,
    because such code implicitly makes assumptions that may not stand the test
    of time. That is the case here: your code should adapt easily to changes
    in Unicode versions without your having to review every place where an
    assumption has become wrong. If you want to use switches, you have to
    document somewhere that your implementation may not work with future
    versions of Unicode, and after each new version you will have to review
    the new Unicode data to see where it affects your code.

    > -----Original Message-----
    > From: [] On Behalf Of Mike
    > Sent: Wednesday, August 15, 2007 01:09
    > To:
    > Subject: Re: Specification for XID_Start and XID_Continue
    > > My opinion is that your code should contain a function to load an
    > > external resource at init time, and then the isXidStart() function will
    > > use the content of the set loaded before during init.
    > In general I would agree, but a simple switch statement should
    > be faster. Also I support Unicode versions 3.2, 4.0, 4.1, and
    > 5.0, and these functions are not dependent on version. If in
    > version 5.1 or later, the list of code points changes, I will
    > probably do something different.
    > Mike
