Re: Proposing UTF-21/24

From: Philippe Verdy (
Date: Wed Jan 24 2007 - 05:00:26 CST

    From: "Ruszlan Gaszanov" <>
    >> I implemented a BOCU-1 encoder/decoder in about 400 lines of C++,
    >> so I wouldn't call it too complex.
    > Complexity is a relative concept.
    > For comparison:
    > - UTF-8 encoder+decoder - under 100 lines (100 / 400 = 25%)
    > - UTF-16 encoder+decoder - under 40 lines (40 / 400 = 10%)
    > - UTF-24 encoder+decoder - under 20 lines (20 / 400 = 5%)
    > - UTF-21 encoder+decoder - exactly 2 lines (2 / 400 = 0.5%)

    Your stats are clearly flawed. They imply that your UTF-21 encoder is written as a single uncommented source line, with the whole function completely inlined along with its tests.

    This is unfair. My UTF-16 encoder or decoder needs just one line, *simpler* than your UTF-21 one, using a single test (which is really easy to inline if needed) and no intermediate assignment.
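
    To make the "one line, a single test" claim concrete, here is a minimal Python sketch of a UTF-16 encoder and decoder for one code point. It is an illustration written for this point, not the code referred to in the message; the names utf16_encode and utf16_decode_at are hypothetical, and the sketch assumes well-formed input (no range check, no handling of unpaired surrogates).

```python
def utf16_encode(cp: int) -> list[int]:
    """Encode one code point as UTF-16 code units (assumes 0 <= cp <= 0x10FFFF, not a surrogate)."""
    # Single test: BMP code points map to themselves, the rest to a surrogate pair.
    # 0xD7C0 is 0xD800 - (0x10000 >> 10), which folds the usual "subtract 0x10000" into a constant.
    return [cp] if cp < 0x10000 else [0xD7C0 + (cp >> 10), 0xDC00 + (cp & 0x3FF)]


def utf16_decode_at(units: list[int], i: int) -> tuple[int, int]:
    """Decode the code point starting at units[i]; return (code point, units consumed).

    Assumes well-formed UTF-16: every lead surrogate is followed by a trail surrogate.
    """
    u = units[i]
    # Single test: is this unit a lead surrogate (0xD800..0xDBFF)?
    return (u, 1) if (u & 0xFC00) != 0xD800 else (((u - 0xD7C0) << 10) + (units[i + 1] - 0xDC00), 2)
```

    Each function body is one return statement with one comparison, which is the style of "line count" being compared above. A production codec would add validation, so its size would differ.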

    A fair comparison of complexity must use comparable programming styles. If one wants trustworthy metrics, counting lines is not accurate. You must count the number of tests to perform, the number of basic arithmetic operations, the number of temporary variables for intermediate lookups, and the number of variable assignments. Ignore comment lines in such metrics, because they can be expanded or removed completely at will!

    Also ignore the length of the function declaration, because it is not language-neutral (just count one declaration per needed function). And ignore the coding style (position of line breaks, empty lines, blanks for indentation, length of variable names...). A reasonably fair metric would be based on counting the nodes in a syntax parse tree.
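
    As a rough illustration of such a metric, the sketch below counts syntax-tree nodes with Python's standard ast module. The helper name count_nodes is hypothetical, and this is only one possible approximation: it counts the nodes of the function declaration instead of treating the declaration as a single unit, and it only measures Python source.

```python
import ast


def count_nodes(source: str) -> int:
    """Count syntax-tree nodes of Python source, ignoring comments, layout and name length."""
    tree = ast.parse(source)
    # Skip Load/Store/Del context markers: they are parser bookkeeping, not written code.
    return sum(1 for node in ast.walk(tree) if not isinstance(node, ast.expr_context))


# The same function written twice: once as a dense one-liner, once spread out
# with a comment, line breaks and long names.
compact = "def f(c): return [c] if c < 65536 else [55232 + (c >> 10), 56320 + (c & 1023)]"
spread = '''
def encode_one_code_point(code_point):
    # BMP code points are stored as-is; others become a surrogate pair.
    return (
        [code_point]
        if code_point < 65536
        else [55232 + (code_point >> 10), 56320 + (code_point & 1023)]
    )
'''

# Line counts differ (1 versus 7 non-empty lines), but the node counts are equal.
same_complexity = count_nodes(compact) == count_nodes(spread)
```

    Comments never reach the syntax tree, and identifiers are single nodes whatever their length, so the two versions score the same even though their line counts differ.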
