From: Marcin 'Qrczak' Kowalczyk (qrczak@knm.org.pl)
Date: Wed Dec 08 2004 - 16:01:36 CST
Lars Kristan <lars.kristan@hermes.si> writes:
> Quite close. Except for the fact that:
> * U+EE93 is represented in UTF-32 as 0x0000EE93
> * U+EE93 is represented in UTF-16 as 0xEE93
> * U+EE93 is represented in UTF-8 as 0x93 (_NOT_ 0xEE 0xBA 0x93)
Then it would be impossible to represent sequences like
U+EEEE U+EEBA U+EE93 in UTF-8: they would be encoded as 0xEE 0xBA 0x93,
which already decodes as the single code point U+EE93. So the conversion
UTF-32 -> UTF-8 -> UTF-32 would not round-trip.
Concatenation of UTF-8-encoded strings would not be equivalent to
UTF-8-encoding of the concatenation of code points.
This is broken.
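
A minimal sketch of the problem, under one reading of the quoted proposal.
The names encode, decode and ESCAPE_BASE are hypothetical, and so is the
assumption that the whole range U+EE80..U+EEFF maps to the single bytes
0x80..0xFF (the quoted text only gives U+EE93 -> 0x93). The decoder is
assumed to prefer a valid UTF-8 sequence whenever one starts at the
current position.

```python
ESCAPE_BASE = 0xEE00  # hypothetical: U+EE80..U+EEFF stand for raw bytes 0x80..0xFF


def encode(code_points):
    """Encode code points to bytes under the proposed (assumed) rule."""
    out = bytearray()
    for cp in code_points:
        if 0xEE80 <= cp <= 0xEEFF:
            # Proposed rule: the escape code point becomes one raw byte.
            out.append(cp - ESCAPE_BASE)
        else:
            # Everything else uses standard UTF-8.
            out += chr(cp).encode("utf-8")
    return bytes(out)


def decode(data):
    """Decode bytes: valid UTF-8 wins, any other byte becomes an escape."""
    result = []
    i = 0
    while i < len(data):
        for n in (1, 2, 3, 4):
            try:
                ch = data[i:i + n].decode("utf-8")
            except UnicodeDecodeError:
                continue
            result.append(ord(ch))
            i += n
            break
        else:
            # No valid UTF-8 sequence starts here: escape the single byte.
            result.append(ESCAPE_BASE + data[i])
            i += 1
    return result


seq = [0xEEEE, 0xEEBA, 0xEE93]
encoded = encode(seq)      # the bytes EE BA 93
round_trip = decode(encoded)

# Decoding piecewise versus decoding the concatenated bytes.
piecewise = decode(b"\xee") + decode(b"\xba\x93")
joined = decode(b"\xee" + b"\xba\x93")

print([hex(cp) for cp in round_trip])
print(piecewise == joined)
```

With these assumptions the three code points come back as the single
code point U+EE93, and decoding the two byte strings separately gives a
different result from decoding their concatenation.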
-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/