Re: Invalid code points

From: Doug Ewell (
Date: Sun May 31 2009 - 17:25:10 CDT

    Hans Aberg <haberg at math dot su dot se> wrote:

    > I think also strictly speaking there are two UTF-8s: one which does
    > not have the integer limitations that are used in Unicode. This could
    > be used to convert integer sequences into byte sequences which then
    > do not have a Unicode character interpretation.

    There is only one UTF-8, the one defined by Unicode and ISO/IEC 10646,
    which maps valid Unicode/10646 scalar values to sequences of bytes.
    Anything else is not UTF-8. Keep repeating this to yourself.
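
    To make the point concrete, here is a minimal sketch of an encoder that
    accepts only Unicode scalar values (0..0x10FFFF, excluding the surrogate
    range 0xD800..0xDFFF) and rejects every other integer. The names
    is_scalar_value and encode_utf8 are made up for illustration and are not
    taken from any standard or library.

```python
MAX_SCALAR = 0x10FFFF                    # highest Unicode code point
SURROGATES = range(0xD800, 0xE000)       # code points that are not scalar values


def is_scalar_value(n: int) -> bool:
    """True if n is a Unicode scalar value (hypothetical helper)."""
    return 0 <= n <= MAX_SCALAR and n not in SURROGATES


def encode_utf8(n: int) -> bytes:
    """Encode one scalar value as UTF-8; refuse anything else."""
    if not is_scalar_value(n):
        raise ValueError(f"{n:#x} is not a Unicode scalar value")
    if n < 0x80:                         # 1 byte: 0xxxxxxx
        return bytes([n])
    if n < 0x800:                        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (n >> 6), 0x80 | (n & 0x3F)])
    if n < 0x10000:                      # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (n >> 12),
                      0x80 | ((n >> 6) & 0x3F),
                      0x80 | (n & 0x3F)])
    return bytes([0xF0 | (n >> 18),      # 4 bytes: 11110xxx + 3 continuation bytes
                  0x80 | ((n >> 12) & 0x3F),
                  0x80 | ((n >> 6) & 0x3F),
                  0x80 | (n & 0x3F)])
```

    With this sketch, encode_utf8(0x20AC) yields the bytes E2 82 AC, while
    encode_utf8(0xD800) and encode_utf8(0x110000) both raise ValueError. A
    scheme that extended the same bit pattern to larger integers would be a
    different encoding, not UTF-8.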

    Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14