RE: Surrogate pairs and UTF-8

From: Peter Constable (petercon@microsoft.com)
Date: Fri Jun 23 2006 - 17:21:26 CDT

    > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of Mike Ayers

    > Surrogates do not exist in UTF-8.

    One might say that surrogate pairs (and triples and quadruples) exist
    in UTF-8, though of course Surrogate Pairs -- in the UTF-16-specific
    sense -- do not. UTF-16 Surrogate Pairs basically do the same thing
    that multi-byte sequences in UTF-8 do: they provide a coding mechanism
    to represent a larger range of code points than a single code unit of
    the given encoding form could represent directly (256 values for 8-bit
    code units and 65536 for 16-bit code units). The two mechanisms differ
    mainly in the details.
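
    As a rough illustration of the parallel, the Python sketch below prints
    the code units that each encoding form uses for a few sample characters.
    The helper name code_units and the choice of sample characters are
    assumptions made for this example only; the encodings themselves come
    from Python's built-in "utf-8" and "utf-16-be" codecs.

```python
def code_units(ch: str) -> dict:
    """Return the UTF-8 and UTF-16 code units for one character.

    Hypothetical helper for illustration; not part of any library.
    """
    # UTF-8 code units are single bytes.
    utf8 = list(ch.encode("utf-8"))
    # UTF-16 code units are 16-bit values; decode them from big-endian bytes.
    raw16 = ch.encode("utf-16-be")
    utf16 = [int.from_bytes(raw16[i:i + 2], "big") for i in range(0, len(raw16), 2)]
    return {"utf8": utf8, "utf16": utf16}


# Sample characters: U+0041, U+00E9, U+20AC, and U+1D11E (outside the BMP).
for ch in ("A", "\u00e9", "\u20ac", "\U0001D11E"):
    units = code_units(ch)
    utf8_hex = " ".join(f"{u:02X}" for u in units["utf8"])
    utf16_hex = " ".join(f"{u:04X}" for u in units["utf16"])
    print(f"U+{ord(ch):04X}  UTF-8: {utf8_hex:<12} UTF-16: {utf16_hex}")
```

    For U+1D11E, UTF-8 needs four 8-bit code units (F0 9D 84 9E) and UTF-16
    needs two 16-bit code units (D834 DD1E, a Surrogate Pair). In both
    cases a sequence of code units stands in for a code point that a single
    unit cannot hold.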

    Peter Constable


