From: Arcane Jill (email@example.com)
Date: Fri Mar 04 2005 - 09:39:21 CST
But you do have to DEcompose them, right? That is, NFC(x) is the same as
NFD(x), if x is a precomposed character added after Unicode 3.0. I mean, you
can't just ignore them. Is that right?
From: firstname.lastname@example.org [mailto:email@example.com] On Behalf Of Andrew C. West
Sent: 04 March 2005 14:23
Subject: Re: Small Java implementation of NFC
Elliotte Harold wrote:
> Are there any decomposable characters beyond the BMP?
Yes, 13 musical symbols at 1D15E..1D164 and 1D1BB..1D1C0.
> Or any characters that would need to be recomposed with other characters?
But there are no characters beyond the BMP that will ever be recomposed using
NFC. Unicode Standard Annex #15 (http://www.unicode.org/reports/tr15/) specifies
that precomposed characters that are added after Unicode 3.0 are excluded from
composition (i.e. not recomposed when NFC is applied to them). As all characters
beyond the BMP were added in Unicode 3.1 or later, you can effectively ignore
any character greater than U+FFFF (or any surrogates if you are processing
UTF-16) when applying NFC to a text stream.
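
The question raised in the reply can be checked empirically. The following is a minimal sketch, not part of either message, and it uses Python's standard unicodedata module instead of a hand-written Java normalizer. The names MUSICAL_SYMBOLS, nfc and nfd are illustrative only. It assumes the Unicode data bundled with the interpreter includes the composition exclusions for the 13 musical symbols cited above.

```python
import unicodedata

# The 13 decomposable musical symbols cited in the thread:
# U+1D15E..U+1D164 (7 characters) and U+1D1BB..U+1D1C0 (6 characters).
MUSICAL_SYMBOLS = [
    chr(cp)
    for cp in (*range(0x1D15E, 0x1D165), *range(0x1D1BB, 0x1D1C1))
]


def nfc(text: str) -> str:
    """Return text in Normalization Form C."""
    return unicodedata.normalize("NFC", text)


def nfd(text: str) -> str:
    """Return text in Normalization Form D."""
    return unicodedata.normalize("NFD", text)


if __name__ == "__main__":
    for ch in MUSICAL_SYMBOLS:
        composed = nfc(ch)
        # NFC still decomposes these characters; it only declines to
        # recompose them, so the NFC and NFD results coincide.
        print(
            f"U+{ord(ch):05X}",
            "NFC =", " ".join(f"U+{ord(c):05X}" for c in composed),
            "| same as NFD:", composed == nfd(ch),
            "| changed by NFC:", composed != ch,
        )
```

If every row reports that NFC changed the character, a normalizer that skips everything above U+FFFF would leave these 13 characters in a form that is not NFC, which is the concern raised in the reply.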