From: Arcane Jill (arcanejill@ramonsky.com)
Date: Fri Mar 04 2005 - 09:39:21 CST
But you do have to DEcompose them, right? That is, NFC(x) is the same as
NFD(x), if x was added after Unicode 3.0. I mean, you can't just ignore them
altogether.
Is that right?
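One way to check this is with a modern JDK, assuming its java.text.Normalizer (Java 6 and later, so newer than this thread) follows the UAX #15 composition exclusions. The sketch below normalizes U+1D15E MUSICAL SYMBOL HALF NOTE under both forms. The class name SupplementaryNfcDemo and its helper methods are made up for illustration, and the code needs Java 8 or later for String.codePoints().

```java
import java.text.Normalizer;

public class SupplementaryNfcDemo {
    // U+1D15E MUSICAL SYMBOL HALF NOTE; a surrogate pair in a Java (UTF-16) string.
    static final String HALF_NOTE = new String(Character.toChars(0x1D15E));

    // Normalize to NFC (canonical decomposition, then canonical composition).
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Normalize to NFD (canonical decomposition only).
    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    // Render each code point of s as U+XXXX, separated by single spaces.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("input: " + codePoints(HALF_NOTE));
        System.out.println("NFC:   " + codePoints(nfc(HALF_NOTE)));
        System.out.println("NFD:   " + codePoints(nfd(HALF_NOTE)));
    }
}
```

If the premise holds, the NFC and NFD lines both show the decomposed pair U+1D157 U+1D165 rather than the original U+1D15E. So an NFC pass still has to decompose these characters even though it never recomposes them.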
Jill
-----Original Message-----
From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of Andrew C. West
Sent: 04 March 2005 14:23
To: unicode@unicode.org
Cc: elharo@metalab.unc.edu
Subject: Re: Small Java implementation of NFC
Elliotte Harold wrote:
>
> Are there any decomposable characters beyond the BMP?
Yes, 13 musical symbols at 1D15E..1D164 and 1D1BB..1D1C0.
> Or any characters that would need to be recomposed with other characters?
But there are no characters beyond the BMP that will ever be recomposed using
NFC.
Unicode Standard Annex #15 (http://www.unicode.org/reports/tr15/) specifies
that precomposed characters that are added after Unicode 3.0 are excluded from
composition (i.e. not recomposed when NFC is applied to them). As all
characters beyond the BMP were added in Unicode 3.1 or later, you can
effectively ignore any character greater than U+FFFF (or any surrogates if you
are processing UTF-16) when applying NFC to a text stream.
Andrew