Re: Surrogate pairs and UTF-8

From: Mike Ayers
Date: Wed Jun 21 2006 - 13:31:22 CDT

    Pavils Jurjans wrote:

    > - The guides on <> site talk only about
    > surrogate pair and UTF-16 conversion. How about the UTF-8?

            Surrogates do not exist in UTF-8. They are the mechanism by which
    UCS-2 (which encodes 16 bits) was simultaneously restricted and extended
    to become UTF-16 (which encodes 21 bits). Surrogates are not
    characters. They are code points reserved for use in UTF-16 only.
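
    To make the distinction concrete, here is a minimal Python sketch. The
    helper names are made up for illustration, and U+1D11E (MUSICAL SYMBOL
    G CLEF) is just a convenient sample character beyond U+FFFF. It shows
    the same code point becoming a surrogate pair in UTF-16 but a plain
    four-byte sequence in UTF-8, and a lone surrogate being rejected by the
    UTF-8 encoder.

```python
def to_surrogate_pair(cp):
    """Split a code point in U+10000..U+10FFFF into (high, low) surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    v = cp - 0x10000                      # 20 bits remain
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


def utf16_units(s):
    """Return the 16-bit code units of s as encoded in UTF-16 (big-endian)."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def utf8_bytes(s):
    """Return the bytes of s as encoded in UTF-8."""
    return list(s.encode("utf-8"))


def lone_surrogate_encodes_to_utf8():
    """Report whether Python's strict UTF-8 encoder accepts a lone surrogate."""
    try:
        "\ud800".encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    clef = "\U0001D11E"
    print([hex(u) for u in to_surrogate_pair(0x1D11E)])  # computed by hand
    print([hex(u) for u in utf16_units(clef)])           # what the codec emits
    print([hex(b) for b in utf8_bytes(clef)])            # no surrogates here
    print(lone_surrogate_encodes_to_utf8())
```

    The UTF-8 bytes are derived directly from the code point U+1D11E, not
    from the two surrogate values, which is why no surrogate ever appears
    in well-formed UTF-8.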



