unicode@Unicode.ORG writes:
> Pls note that UTF-8 is a 31-bit standard (not just a 24-bit standard),
> so it offers a variable-length byte encoding of all 10646 characters let
> alone all Unicode characters.
Well, to be more exact, UTF-8 is a transformation format that
can encode all ISO/IEC 10646 characters. ISO/IEC 10646 can
encode 2 G characters (2**31 code positions). UTF-8 transforms
a 10646 character into a sequence of between 1 and *6* octets.
See http://www.dkuug.dk/JTC1/SC2/WG2/docs/N1335 for further reference.
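
To make the 1-to-6 octet range concrete, here is a minimal Python sketch of the transformation, assuming the usual lead-octet markers (0xC0, 0xE0, 0xF0, 0xF8, 0xFC) and 6 payload bits per continuation octet. The function name utf8_encode_31bit is made up for illustration, and it is written by hand because Python's built-in codec only covers values up to U+10FFFF.

```python
def utf8_encode_31bit(cp: int) -> bytes:
    """Encode a value in 0 .. 2**31 - 1 as a sequence of 1 to 6 octets.

    Hypothetical helper for illustration only, not a library API.
    """
    if not 0 <= cp < 2**31:
        raise ValueError("value outside the 31-bit range")
    if cp < 0x80:
        # 7-bit values are passed through as a single octet.
        return bytes([cp])

    # (bits that fit, sequence length, lead-octet marker)
    table = (
        (11, 2, 0xC0),
        (16, 3, 0xE0),
        (21, 4, 0xF0),
        (26, 5, 0xF8),
        (31, 6, 0xFC),
    )
    for bits, length, lead in table:
        if cp < (1 << bits):
            break

    octets = []
    # Continuation octets carry 6 bits each, lowest bits first.
    for _ in range(length - 1):
        octets.append(0x80 | (cp & 0x3F))
        cp >>= 6
    # Whatever is left goes into the lead octet.
    octets.append(lead | cp)
    return bytes(reversed(octets))


if __name__ == "__main__":
    for value in (0x41, 0x20AC, 0x10FFFF, 0x200000, 0x7FFFFFFF):
        print(hex(value), utf8_encode_31bit(value).hex(" "))
```

The largest value, 2**31 - 1, is the one that needs all six octets. For values up to U+10FFFF the result should agree with Python's own str.encode("utf-8").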
keld
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:31 EDT