Re: UTF-16 inside UTF-8

From: David E. Hollingsworth (
Date: Tue Nov 04 2003 - 09:51:59 EST


    I believe this is described pretty well in sections 3.8 & 3.9 (plus
    conformance requirement C12b) of Unicode 4.0.

    Surrogate pairs are for UTF-16 only. In UTF-8 and UTF-32, any
    encoding of a surrogate code point (U+D800..U+DFFF), paired or
    otherwise, is an ill-formed code unit sequence, and conformant
    processes must treat it as an error.
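
    As a rough illustration (not part of the standard's text), here is a
    small Python sketch. It assumes the strict UTF-8 and UTF-32 codecs of
    Python 3.4 or later, which reject surrogates; the helper name
    is_well_formed is made up for this example.

```python
def is_well_formed(data: bytes, encoding: str) -> bool:
    """Hypothetical helper: True if data decodes under Python's strict codec."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# U+10000 is the first code point that needs a surrogate pair in UTF-16.
ch = "\U00010000"

# UTF-16 (big-endian, no BOM): the surrogate pair D800 DC00.
utf16_units = ch.encode("utf-16-be")

# UTF-8: one four-byte sequence, F0 90 80 80. No surrogates are involved.
utf8_bytes = ch.encode("utf-8")

# The same pair with each surrogate written as its own three-byte
# sequence (ED A0 80, ED B0 80). This is "UTF-16 inside UTF-8".
surrogates_in_utf8 = b"\xed\xa0\x80\xed\xb0\x80"

# A lone surrogate D800 written as a single UTF-32 code unit.
surrogate_in_utf32 = b"\x00\x00\xd8\x00"

print(is_well_formed(utf8_bytes, "utf-8"))
print(is_well_formed(surrogates_in_utf8, "utf-8"))
print(is_well_formed(surrogate_in_utf32, "utf-32-be"))
```

    Only the first check should succeed; the strict decoders refuse the
    other two byte sequences rather than turning them into U+10000 or
    into a lone surrogate.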

