Public Review Issues

306	Proposed Update UAX #29, Unicode Text Segmentation	Closing Date: 2016.05.02
Status:	Closed
Originator:	UTC
Informal Discussion:	Unicode Mail List (Join)
Formal Feedback:	Contact Form
Resolution:	The UAX will be updated with final content and published as part of Unicode 9.0.

Description of Issue:

This Unicode Standard Annex will be updated for Unicode 9.0. A draft of the proposed update is available for general public review and comment. Draft updated 2016-04-19.

New classes and rules have been added for grapheme cluster breaking and word breaking, to keep breaks from occurring inappropriately inside various types of emoji sequences.
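
As a rough illustration of the kind of break this change is meant to prevent, the sketch below groups code points so that an emoji sequence stays in one cluster. It is a deliberately simplified toy and not the UAX #29 rule set: the function names are hypothetical, and it assumes that only ZERO WIDTH JOINER (U+200D), VARIATION SELECTOR-16 (U+FE0F), and the skin tone modifiers U+1F3FB..U+1F3FF suppress a break. The actual rules are driven by Grapheme_Cluster_Break property values from the UCD and are more restrictive.

```python
ZWJ = "\u200d"    # ZERO WIDTH JOINER
VS16 = "\ufe0f"   # VARIATION SELECTOR-16


def is_emoji_modifier(ch: str) -> bool:
    """True for the skin tone modifiers U+1F3FB..U+1F3FF."""
    return 0x1F3FB <= ord(ch) <= 0x1F3FF


def toy_clusters(text: str) -> list[str]:
    """Split text into clusters, keeping simple emoji sequences together.

    Assumption (simplified, NOT the UAX #29 rules): a code point is
    appended to the previous cluster if it is a ZWJ, VS16, or skin tone
    modifier, or if the previous cluster ends with a ZWJ.
    """
    clusters: list[str] = []
    for ch in text:
        joins_previous = bool(clusters) and (
            ch == ZWJ
            or ch == VS16
            or is_emoji_modifier(ch)
            or clusters[-1].endswith(ZWJ)
        )
        if joins_previous:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# MAN + ZWJ + WOMAN + ZWJ + GIRL (a family emoji ZWJ sequence)
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467"
# THUMBS UP SIGN + a skin tone modifier
thumbs = "\U0001F44D\U0001F3FD"

print(len(toy_clusters(family)))           # one cluster for the whole sequence
print(toy_clusters("a" + thumbs + "b"))    # the modifier stays with its base
```

A naive splitter that breaks between every pair of code points would divide the family sequence into five pieces. This toy also over-joins in cases the real rules would not (for example, a modifier following a non-emoji letter), which is why an implementation should rely on the property data rather than on a hard-coded list like this one.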

In this revision, the Word_Break classification of U+202F NARROW NO-BREAK SPACE (NNBSP) is modified to correct the text segmentation behavior of U+202F for Mongolian usage. For further background on this issue and possible ways to address it, see PRI #308, Property Change for U+202F NARROW NO-BREAK SPACE (NNBSP).

Also, the formerly empty Prepend class of the Grapheme_Cluster_Break property is redefined to consist of all prefixed format control characters and a few other characters with certain Indic_Syllabic_Category property values.

The corresponding property value changes will be incorporated in the UCD data files for Unicode 9.0.

For other changes, and for more detail, see Modifications.

How to Provide Feedback: For information about how to discuss this Public Review Issue and how to supply formal feedback, please see the feedback and discussion instructions. The accumulated feedback received so far on this issue is shown below, or you can look at a full page view. Feedback is reviewed by the relevant committee according to its meeting schedule.