RE: Roundtripping Solved

From: Arcane Jill (arcanejill@ramonsky.com)
Date: Thu Dec 16 2004 - 07:28:53 CST


    >Arcane Jill wrote:
    >> # for all possible octet sequences s:
    >> # length of UTF-8(f(s)) <= length of s,

    >No, that is not the requirement. It is:
    >bytelength(f(s)) <= 2*bytelength(s)

    You haven't understood. By definition, s is an octet stream, and f(s) is a
    Unicode character stream - and therefore "bytelength(f(s))" is completely
    meaningless. You cannot take the byte-length of a Unicode character or a
    Unicode character stream. "bytelength(UTF-8(f(s)))", on the other hand, does
    make sense.

    And I say again, your own solution, in which (for example) 0x9F maps to
    U+EE9F, does not meet the requirement, since UTF-8(U+EE9F) is { EE BA 9F },
    the byte-length of which is > 2 * 1.

    What was wrong with my suggestion, which would have mapped 0x9F to { U+0002
    U+001F }, by the way? This actually /does/ meet your new requirement, since
    UTF-8(U+0002 U+001F) is { 02 1F }, the byte-length of which is exactly 2 * 1.
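
    The arithmetic for both mappings can be checked mechanically. What follows
    is a minimal sketch in Python, not anything from the thread itself: the
    helper names utf8_len and meets_bound are made up for illustration, and
    only the single octet 0x9F and the two mappings quoted above are tested.

```python
# The single input octet under discussion.
s = bytes([0x9F])

# Candidate values of f(s), copied from the message (not derived here).
private_use = "\uEE9F"        # 0x9F -> U+EE9F
escape_pair = "\u0002\u001F"  # 0x9F -> { U+0002 U+001F }


def utf8_len(chars: str) -> int:
    """Number of octets in the UTF-8 encoding of a character sequence."""
    return len(chars.encode("utf-8"))


def meets_bound(chars: str, octets: bytes) -> bool:
    """True if bytelength(UTF-8(f(s))) <= 2 * bytelength(s)."""
    return utf8_len(chars) <= 2 * len(octets)


def hex_octets(chars: str) -> str:
    """UTF-8 octets of a character sequence as space-separated hex."""
    return " ".join(f"{b:02X}" for b in chars.encode("utf-8"))


for label, mapped in (("U+EE9F", private_use), ("U+0002 U+001F", escape_pair)):
    print(label, "->", hex_octets(mapped), "| within bound:", meets_bound(mapped, s))
```

    Note that the bound is checked on the UTF-8 form of f(s), since a byte
    length is only defined once an encoding has been chosen.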

    Jill

    -----Original Message-----
    From: Lars Kristan [mailto:lars.kristan@hermes.si]
    Sent: 16 December 2004 11:54
    To: 'Arcane Jill'; Unicode
    Subject: RE: Roundtripping Solved


