Re: Small Java implementation of NFC

From: Andrew C. West (andrewcwest@alumni.princeton.edu)
Date: Fri Mar 04 2005 - 08:22:48 CST

    Elliotte Harold wrote:
    >
    > Are there any decomposable characters beyond the BMP?

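    Yes: 13 musical symbols at U+1D15E..U+1D164 and U+1D1BB..U+1D1C0, and also
    the CJK compatibility ideographs at U+2F800..U+2FA1D, which have singleton
    canonical decompositions.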

    > Or any characters that would need to be recomposed with other characters?

    But as of Unicode 4.0 there are no characters beyond the BMP that are
    recomposed by NFC.

    Unicode Standard Annex #15 (http://www.unicode.org/reports/tr15/) specifies that
    precomposed characters added after Unicode 3.0 are excluded from composition
    (i.e. NFC never produces them). As all characters beyond the BMP were added in
    Unicode 3.1 or later, you can leave any character greater than U+FFFF (or any
    surrogates if you are processing UTF-16) out of the composition tables when
    implementing NFC. The decomposition and canonical reordering steps still
    apply to them, because NFC decomposes the text first: the musical symbols
    above are replaced by their decompositions and are not recomposed.
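
    One way to check this behaviour against an existing implementation is a
    minimal sketch like the one below. It assumes Java 8 or later and uses the
    JDK's own java.text.Normalizer, not the small implementation under
    discussion; the class and method names are hypothetical. It counts the code
    points in a range that NFD changes, and shows what NFC does to one of the
    musical symbols.

```java
import java.text.Normalizer;

// Hypothetical helper class for illustration only; not part of any library.
class SupplementaryNfcCheck {

    /** Formats a string as space-separated U+XXXX code points. */
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    /** Applies NFC to a single code point and returns the result as U+XXXX text. */
    static String nfc(int cp) {
        String s = new String(Character.toChars(cp));
        return codePoints(Normalizer.normalize(s, Normalizer.Form.NFC));
    }

    /** Counts code points in [from, to] that the given form changes. */
    static int countChanged(int from, int to, Normalizer.Form form) {
        int count = 0;
        for (int cp = from; cp <= to; cp++) {
            String s = new String(Character.toChars(cp));
            if (!Normalizer.normalize(s, form).equals(s)) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The two musical symbol ranges mentioned above.
        System.out.println(countChanged(0x1D15E, 0x1D164, Normalizer.Form.NFD)); // 7
        System.out.println(countChanged(0x1D1BB, 0x1D1C0, Normalizer.Form.NFD)); // 6

        // NFC decomposes first and does not recompose an excluded character.
        System.out.println(nfc(0x1D15E)); // U+1D157 U+1D165

        // Result depends on the Unicode version of the running JDK.
        System.out.println(countChanged(0x10000, 0x10FFFF, Normalizer.Form.NFC));
    }
}
```

    The last count is left without an expected value on purpose, since it
    depends on which Unicode version the JDK in use supports.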

    Andrew


