    >Arcane Jill wrote:
    >> # for all possible octet sequences s:
    >> # length of UTF-8(f(s)) <= length of s,

    >No, that is not the requirement. It is:
    >bytelength(f(s)) <= 2*bytelength(s)

    You haven't understood. By definition, s is an octet stream and f(s) is a
    Unicode character stream, so "bytelength(f(s))" is completely meaningless.
    A Unicode character, or a stream of them, has no byte length until you
    choose an encoding form. "bytelength(UTF-8(f(s)))", on the other hand,
    does make sense.
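
    To make that concrete, here is a minimal Python sketch (the variable names
    are mine, purely for illustration). It encodes one code point, the U+EE9F
    that comes up later in this message, in three encoding forms and shows
    that the byte count belongs to the encoding, not to the character.

```python
# Byte length is a property of an encoded form, not of a code point.
ch = "\uEE9F"  # a single Unicode code point (private-use area)

utf8_len = len(ch.encode("utf-8"))       # UTF-8 encoding form
utf16_len = len(ch.encode("utf-16-be"))  # UTF-16, big-endian, no BOM
utf32_len = len(ch.encode("utf-32-be"))  # UTF-32, big-endian, no BOM

print(utf8_len, utf16_len, utf32_len)  # 3 2 4
```

    Same character, three different answers, which is why the requirement has
    to name the encoding.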

    And I say again, your own solution, in which (for example) 0x9F maps to
    U+EE9F, does not meet the requirement, since UTF-8(U+EE9F) is { EE BA 9F },
    the byte-length of which is 3, and 3 > 2 * 1.
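
    For anyone who wants to check the { EE BA 9F } figure by hand, here is a
    small sketch of the three-byte UTF-8 form. The helper name
    utf8_three_byte is my own invention, and it assumes a code point in the
    range U+0800..U+FFFF (it does not bother to reject surrogates).

```python
def utf8_three_byte(cp):
    """Hand-encode a code point in U+0800..U+FFFF as three UTF-8 bytes."""
    assert 0x0800 <= cp <= 0xFFFF
    return bytes([
        0xE0 | (cp >> 12),          # 1110xxxx: top 4 bits
        0x80 | ((cp >> 6) & 0x3F),  # 10xxxxxx: middle 6 bits
        0x80 | (cp & 0x3F),         # 10xxxxxx: low 6 bits
    ])

encoded = utf8_three_byte(0xEE9F)
print(encoded.hex().upper())  # EEBA9F
print(len(encoded) > 2 * 1)   # True: 3 bytes out for 1 byte in
```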

    What was wrong with my suggestion, which would have mapped 0x9F to { U+0002
    U+001F }, by the way? That one actually /does/ meet your new requirement,
    since its UTF-8 form is { 02 1F }, which is two bytes.
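
    Both candidate mappings for the single octet 0x9F can be checked against
    the 2x bound mechanically. This is only a sketch: meets_bound is a name I
    made up, and it tests the requirement as I read it, namely
    bytelength(UTF-8(f(s))) <= 2 * bytelength(s).

```python
def meets_bound(source, code_points):
    """True if UTF-8 of the mapped code points is at most 2x the source length."""
    text = "".join(chr(cp) for cp in code_points)
    return len(text.encode("utf-8")) <= 2 * len(source)

src = bytes([0x9F])  # the one-octet input under discussion

print(meets_bound(src, [0xEE9F]))          # False: EE BA 9F is 3 bytes
print(meets_bound(src, [0x0002, 0x001F]))  # True: 02 1F is 2 bytes
```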


