Re: Line breaking status of emoji modifiers

From: Mark Davis ☕️ <mark_at_macchiato.com>
Date: Sun, 6 Dec 2015 18:25:19 +0100

Yes. This was discussed at the last UTC, and for line break (and other
segmentation, e.g. #29), there is an action to propose appropriate rules for
9.0. There are three types of emoji sequences that need to be handled:

   - flag sequences
   - modifier sequences
   - zwj sequences
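
As a rough illustration of how the three kinds differ at the code point
level, here is a minimal Python sketch. The function names are hypothetical,
and the checks are deliberately loose (for example, the base character of a
modifier sequence is not validated), so treat it as a sketch rather than a
conformant classifier.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER

def is_regional_indicator(ch):
    # Regional indicator symbols: U+1F1E6..U+1F1FF
    return "\U0001F1E6" <= ch <= "\U0001F1FF"

def is_emoji_modifier(ch):
    # Emoji modifiers (skin tones): U+1F3FB..U+1F3FF
    return "\U0001F3FB" <= ch <= "\U0001F3FF"

def sequence_kind(seq):
    """Roughly classify a candidate emoji sequence; None if nothing matches."""
    if len(seq) == 2 and all(is_regional_indicator(c) for c in seq):
        return "flag"
    if len(seq) == 2 and is_emoji_modifier(seq[1]):
        # Assumption: the first code point is a valid modifier base.
        return "modifier"
    if ZWJ in seq[1:-1]:
        # A joiner somewhere between the first and last code points.
        return "zwj"
    return None
```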

In the meantime, people are customizing their implementations to deal with
the emoji sequences. For now, it may be simpler for some to just use the
complete list of current sequences as exceptions, and disallow breaking
within them.
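
A minimal sketch of that exception-list approach, in Python, might look like
the following. The list here is a three-entry placeholder, not the complete
list of current sequences, and the function names are hypothetical. It
assumes the line breaker reports break opportunities as code point offsets
into the text.

```python
# Placeholder exception list; a real one would hold every current sequence.
EMOJI_SEQUENCES = [
    "\U0001F1E9\U0001F1EA",                        # flag: regional indicators D, E
    "\U0001F44D\U0001F3FD",                        # modifier: thumbs up + skin tone
    "\U0001F468\u200D\U0001F469\u200D\U0001F467",  # zwj: man, woman, girl
]

def forbidden_break_offsets(text, sequences=EMOJI_SEQUENCES):
    """Offsets (in code points) that fall strictly inside a listed sequence."""
    forbidden = set()
    for seq in sequences:
        start = text.find(seq)
        while start != -1:
            # Every position after the first code point and before the end.
            forbidden.update(range(start + 1, start + len(seq)))
            start = text.find(seq, start + 1)
    return forbidden

def filter_breaks(text, break_offsets, sequences=EMOJI_SEQUENCES):
    """Drop break opportunities that would split a listed sequence."""
    forbidden = forbidden_break_offsets(text, sequences)
    return [b for b in break_offsets if b not in forbidden]
```

The idea is to run this as a post-filter over whatever opportunities the
existing line breaker returns, so the breaker itself does not need to change
until proper rules are available.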

Mark

On Sun, Dec 6, 2015 at 1:08 AM, Simon Cozens <simon_at_simon-cozens.org> wrote:

> My renderer just got hit with an interesting, if possibly obscure, bug.
>
> UTR#51 says "A supported emoji modifier sequence should be treated as a
> single grapheme cluster for editing purposes (cursor movement, deletion,
> etc.); word break, line break, etc." However, the modifier codepoints
> have line break category AL.
>
> So you have an emoji (line break ID) and its modifier (line break AL),
> and ICU (quite correctly) inserts a line break opportunity between the
> two. This split the cluster, and then everything went downhill after that.
>
> If you don't expect a line break here, wouldn't the modifiers be better
> off as CM for line breaking purposes rather than AL?
>
Received on Sun Dec 06 2015 - 11:26:46 CST
