Re: North Indic Fractions etc.

From: Philippe Verdy (
Date: Tue Jun 07 2011 - 00:19:17 CDT


    While experimenting again with the online UnicodeSet tool, I just found
    a new bug concerning text/word boundaries in property values searched
    by /regexps/. See:


    OK, this returns data about all code points in a block whose name
    contains a word starting with "latin" (ignoring case). Note that \b
    (word boundary) is honored at the _beginning_ of a word (here this
    returns all Latin


    BAD, this should return a non-empty subset; all of the above concern
    blocks whose names contain a word TERMINATED by "latin", so this should
    be equivalent, but it is not.


    BAD, the result is empty: I just want a list of the code points in all
    blocks whose names contain the word "latin".
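
    For comparison, here is a minimal sketch in Python of the behaviour I
    would expect from the three kinds of pattern. The exact expressions
    typed into the tool are not reproduced here, so the patterns, the
    helper name matching_blocks and the short sample of block names are
    only assumptions for illustration. Python's "re" module is not the
    engine behind the UnicodeSet tool, so this shows the expectation, not
    the tool's actual implementation.

```python
import re

# Small illustrative sample of Unicode block names (not the full list).
BLOCK_NAMES = [
    "Basic Latin",
    "Latin-1 Supplement",
    "Latin Extended-A",
    "Latin Extended Additional",
    "Greek and Coptic",
    "Cyrillic",
]


def matching_blocks(pattern, names=BLOCK_NAMES):
    """Return the names in which `pattern` is found, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [name for name in names if regex.search(name)]


# Hypothetical patterns: boundary before, after, and on both sides of "latin".
starts = matching_blocks(r"\blatin")
ends = matching_blocks(r"latin\b")
whole = matching_blocks(r"\blatin\b")

for label, result in (("\\blatin", starts), ("latin\\b", ends), ("\\blatin\\b", whole)):
    print(label, "->", result)
```

    In this sample "latin" always appears as a complete word, so all three
    patterns select the same four Latin blocks, which is the equivalence I
    expected from the tool.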

    What am I doing wrong? I thought it was correct according to the help
    page, which explains the supported text/word boundaries (^, $, and \b):


    2011/6/6 fantasai <>:
    > Is there a reason why the North Indic fractions and Aegean numbers and
    > measures are not assigned to any scripts in the ScriptExtensions.txt
    > file? I don't really know what list of scripts they should belong to,
    > but they don't seem very "Common".
    > ~fantasai

    This archive was generated by hypermail 2.1.5 : Tue Jun 07 2011 - 00:25:40 CDT