What should happen with "\N{LATIN SMALL LIGATURE IJ}" =~ /(i)(j)/i

From: karl williamson (public@khwilliamson.com)
Date: Sun Sep 05 2010 - 16:16:53 CDT


    "\N{LATIN SMALL LIGATURE IJ}" =~ /ij/i

    matches, as the fold of the ligature is 'ij'. But if you simply add
    capturing parentheses, as in this post's subject line, it becomes
    somewhat nonsensical, as each captured group should match some part of
    the indivisible character LATIN SMALL LIGATURE IJ. The problem is not
    restricted to ligatures: when the comparison is done on the NFD
    normalizations of the string and the pattern, the captured portion may
    not be present in the original string at all.
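
    The effect can be sketched outside Perl. The following is a minimal
    illustration in Python, not a description of what Perl or ICU actually
    do. The helper name capture_after_normalizing is hypothetical, and NFKD
    (compatibility decomposition) is used as a stand-in for the folding
    step described above, since both turn the single ligature into the two
    letters 'i' and 'j'.

```python
import re
import unicodedata


def capture_after_normalizing(text, pattern, form):
    """Normalize `text` to `form`, match `pattern` case-insensitively,
    and return the tuple of captured groups (or None if no match).

    Hypothetical helper for illustration only.
    """
    normalized = unicodedata.normalize(form, text)
    match = re.search(pattern, normalized, re.IGNORECASE)
    return match.groups() if match else None


# Case 1: the ligature from the subject line, one indivisible code point.
ligature = "\N{LATIN SMALL LIGATURE IJ}"
ligature_groups = capture_after_normalizing(ligature, r"(i)(j)", "NFKD")

# Case 2: a precomposed letter, matched against its NFD form (e + U+0301).
e_acute = "\N{LATIN SMALL LETTER E WITH ACUTE}"
nfd_groups = capture_after_normalizing(e_acute, r"(e)", "NFD")

# In both cases the captured text is not a substring of the original.
ligature_captures_in_original = [g in ligature for g in ligature_groups]
nfd_captures_in_original = [g in e_acute for g in nfd_groups]
```

    In both cases the match succeeds only on the normalized text, so the
    captured groups have no corresponding substring or offsets in the
    original string, which is the difficulty raised above.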

    I didn't see any reference to this in the ICU documentation.


