UAX 29

From: Daniel Ehrenberg (microdan@gmail.com)
Date: Wed Aug 22 2007 - 11:21:27 CDT

    Hi,

    I'm reading UAX 29 in order to implement grapheme boundaries (and
    later word and sentence boundaries) for a Unicode library for the
    Factor programming language. So far, for grapheme boundary detection,
    I have a fairly direct implementation of the boundary conditions as
    listed: I iterate through the string, check each connectedness
    condition at every position, and report a grapheme break if all of
    them fail. This implementation works, but I'm wondering about a
    table-based implementation, which could be faster and would allow
    tailoring (mine doesn't really allow that, short of rewriting it).
    The UAX frequently refers to table-based implementations, but it
    never describes exactly what they are or how I might go about
    implementing one. I tried to find the relevant code in ICU, but I'm
    somewhat new to C++ and could not locate where the tables are
    generated.
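
    For concreteness, one possible reading of "table-based" is sketched
    below in Python (not Factor, and not how ICU does it): each character
    is mapped to a break class, and a two-dimensional table indexed by
    (previous class, current class) says whether a break falls between
    them. This is only a sketch under stated assumptions. The classify
    function is a rough stand-in built on general categories and Hangul
    ranges rather than the real Grapheme_Cluster_Break property data, the
    rule numbers follow my reading of UAX 29, and all names here are
    hypothetical.

```python
import unicodedata

# Break classes (hypothetical names, one small integer per class).
CR, LF, CONTROL, EXTEND, L, V, T, LV, LVT, OTHER = range(10)
N = 10


def classify(ch):
    """Approximate break class of one character (NOT the real property data)."""
    cp = ord(ch)
    if cp == 0x0D:
        return CR
    if cp == 0x0A:
        return LF
    cat = unicodedata.category(ch)
    if cat in ("Cc", "Cf", "Zl", "Zp"):
        return CONTROL
    if cat in ("Mn", "Me"):          # assumption: stands in for Grapheme_Extend
        return EXTEND
    if 0x1100 <= cp <= 0x115F:       # Hangul leading consonants
        return L
    if 0x1160 <= cp <= 0x11A7:       # Hangul vowels
        return V
    if 0x11A8 <= cp <= 0x11FF:       # Hangul trailing consonants
        return T
    if 0xAC00 <= cp <= 0xD7A3:       # precomposed Hangul syllables
        return LV if (cp - 0xAC00) % 28 == 0 else LVT
    return OTHER


def build_break_table():
    """table[a][b] is True when a break is allowed between classes a and b."""
    table = [[True] * N for _ in range(N)]   # GB10: break everywhere by default
    # Rules are applied lowest priority first, so later writes win.
    for a in range(N):
        table[a][EXTEND] = False             # GB9: no break before Extend
    for a in (LVT, T):
        table[a][T] = False                  # GB8
    for a in (LV, V):
        for b in (V, T):
            table[a][b] = False              # GB7
    for b in (L, V, LV, LVT):
        table[L][b] = False                  # GB6
    for c in (CR, LF, CONTROL):
        for x in range(N):
            table[c][x] = True               # GB4: break after controls
            table[x][c] = True               # GB5: break before controls
    table[CR][LF] = False                    # GB3: keep CR LF together
    return table


BREAK = build_break_table()


def graphemes(s, table=BREAK):
    """Split s into clusters with one table lookup per adjacent pair."""
    if not s:
        return []
    out, start = [], 0
    prev = classify(s[0])
    for i in range(1, len(s)):
        cur = classify(s[i])
        if table[prev][cur]:
            out.append(s[start:i])
            start = i
        prev = cur
    out.append(s[start:])
    return out
```

    In a sketch like this, tailoring would amount to copying the table
    and changing individual entries before passing it to graphemes,
    rather than rewriting the scanning loop. A pair table suffices here
    only because each grapheme rule looks at two adjacent classes; rules
    that need more context (as some word and sentence rules do) would
    presumably need a state machine instead.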

    If someone could help me with this, that would be great.

    Daniel Ehrenberg


