RE: Public Review Issues update: UAX #31

From: Philippe Verdy
Date: Tue Sep 25 2007 - 16:09:36 CDT

    Rick McGowan wrote:
    > There is also a draft 2 version of the proposed update of UAX #29: Text
    > Boundaries. This update adds CR, LF, Extend, and Control as needed,
    > clarifies use of "Any", updates MidLetter to include U+2018, and adds a
    > new kind of grapheme cluster: extended combining character sequences. See

    This update removes CR and LF from the "Sep" class (in sentence
    boundaries, i.e. Table 4 for Sentence_Break Property values), but when I
    look at the document, I don't see any rule that accepts "Sep" without
    also accepting CR and LF, so we now see "Sep | CR | LF" in many "SB*" rules.

    What is the point of this exclusion? It does not seem to change anything
    in the intended result (and it does not change the existing rules regarding
    the non-breakable sequence CR followed by LF).
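
To make the question concrete, here is a minimal Python sketch of the
equivalence I have in mind. The code points are my assumption, not taken
from the draft: I assume the old "Sep" class was CR, LF, NEL (U+0085),
LS (U+2028) and PS (U+2029), and that the new "Sep" keeps only the last
three. The function names are hypothetical helpers for illustration only.

```python
# Assumed class contents (not quoted from the draft).
CR, LF = "\r", "\n"
NEL, LS, PS = "\u0085", "\u2028", "\u2029"

OLD_SEP = frozenset({CR, LF, NEL, LS, PS})  # "Sep" before the update (assumed)
NEW_SEP = frozenset({NEL, LS, PS})          # "Sep" after the update (assumed)


def matches_old(ch: str) -> bool:
    """A rule position written as plain "Sep" under the old definition."""
    return ch in OLD_SEP


def matches_new(ch: str) -> bool:
    """The same rule position rewritten as "Sep | CR | LF"."""
    return ch in NEW_SEP or ch == CR or ch == LF


def first_difference():
    """Return the first code point on which the two forms disagree, or None."""
    for cp in range(0x110000):
        ch = chr(cp)
        if matches_old(ch) != matches_new(ch):
            return cp
    return None
```

Under these assumptions first_difference() finds no code point where the
two forms disagree, so a rule would behave differently only if it used
"Sep" alone, without "| CR | LF".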

    Did you forget something in the document, where an occurrence of "Sep"
    in some rule would not match either CR or LF? Or is it made for clarity (I'm
    not sure that this change really clarifies anything)?

