Re: Sample code for NFC and Plane 1 characters

From: Mark Davis (mark.davis@jtcsv.com)
Date: Wed Mar 09 2005 - 12:57:26 CST

    On the plate for 4.1 is updating that code. Markus Scherer and Vladimir
    Weinstein provided fixed code, but it slipped through the cracks in the
    previous versions.

    Mark

    ----- Original Message -----
    From: "Elliotte Harold" <elharo@metalab.unc.edu>
    To: "Unicode List" <unicode@unicode.org>
    Sent: Wednesday, March 09, 2005 07:36
    Subject: Sample code for NFC and Plane 1 characters

    > I'm looking at the sample code for performing NFC (specifically
    > recomposition) found at
    > http://www.unicode.org/reports/tr15/Normalizer.html and my initial
    > impression is that this isn't going to work for Unicode 4.0 because it's
    > pretty much ignoring the issues of surrogate pairs. That is, it seems to
    > be operating on Java chars rather than on Unicode code points.
    >
    > Am I missing something? Is this feasible? For instance, if no
    > characters from beyond the BMP were ever combined with anything in NFC,
    > then one could simply return the surrogate pairs without ever
    > recombining them. However, if there are any recompositions in Plane 1 or
    > 2, then this sample code might need to be updated.
    >
    > Hmm, it looks like the decompose functions in the sample code also only
    > operate on chars, not ints; and it's recently been pointed out here that
    > some characters beyond the BMP do in fact decompose, so it really
    > looks like this code is out of date. Is there any chance it will be
    > updated to properly handle surrogate pairs?
    >
    > --
    > Elliotte Rusty Harold elharo@metalab.unc.edu
    > XML in a Nutshell 3rd Edition Just Published!
    > http://www.cafeconleche.org/books/xian3/
    > http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim
    >
    >
    >
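
To make the char-versus-code-point concern above concrete, here is a minimal sketch in Java. It is not the TR15 sample code and not the fixed code mentioned in the reply. It assumes a runtime that provides the code point methods on String and Character (Java 5 and later) and the platform's own java.text.Normalizer (Java 6 and later), used here only as a stand-in to show that a character beyond the BMP can decompose. The class name SupplementaryDemo and the helpers toCodePoints and hex are hypothetical names chosen for illustration.

```java
import java.text.Normalizer;

public class SupplementaryDemo {

    // Walk the string by code point rather than by char, so that a
    // surrogate pair is delivered as a single int value.
    public static int[] toCodePoints(String s) {
        int[] out = new int[s.codePointCount(0, s.length())];
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            out[n++] = cp;
            i += Character.charCount(cp); // advances by 2 for a surrogate pair
        }
        return out;
    }

    // Hypothetical helper: render a string as space-separated U+XXXX values.
    public static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int cp : toCodePoints(s)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1D15E MUSICAL SYMBOL HALF NOTE lies in Plane 1 and has a
        // canonical decomposition to U+1D157 U+1D165.
        String halfNote = new String(Character.toChars(0x1D15E));

        System.out.println(halfNote.length());             // 2 (chars)
        System.out.println(toCodePoints(halfNote).length); // 1 (code point)

        String nfd = Normalizer.normalize(halfNote, Normalizer.Form.NFD);
        System.out.println(hex(nfd));                      // U+1D157 U+1D165
    }
}
```

A loop that inspects one char at a time would see the two surrogates of U+1D15E as separate units and could not look up its decomposition, which is the failure mode the original message describes.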


