Re: UAX 29 9.0.0 new emoji flag rules questions and comments

From: Daniel Bünzli <daniel.buenzli_at_erratique.ch>
Date: Wed, 22 Jun 2016 12:10:30 +0100

On Wednesday, 22 June 2016 at 01:32, Laurentiu Iancu wrote:
> Re #1, the ^ symbol indeed denotes a start-of-line anchor, in usual regex notation, and the corresponding rules could use sot instead.

By the way, it seems to me that an equivalent formulation of GB12/GB13 and WB15/WB16 would be the following sequence of rules:

RI RI ÷ RI RI
RI × RI

This fits particularly well in the case of word breaking since you already need as much context as this because of the rules WB{6,7,11,12}. It also avoids regexps and negation.
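
For reference, the behaviour GB12/GB13 specify for a run of regional indicators (pair them up from the start of the run and break only between pairs) can be sketched in Python as below. This is only an illustration of that pairing, not a full UAX #29 segmenter: the names is_ri and ri_breaks are hypothetical, and the function reports boundaries between two RI characters only.

```python
def is_ri(ch):
    # Regional Indicator symbols occupy U+1F1E6..U+1F1FF.
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def ri_breaks(text):
    """Return the indices i where a break falls between text[i-1] and
    text[i], looking only at boundaries between two RI characters."""
    breaks = []
    run = 0  # length of the RI run that ends just before index i
    for i, ch in enumerate(text):
        if is_ri(ch):
            # An even, non-zero number of preceding RIs means the
            # previous pair is complete, so a break is allowed here.
            if run > 0 and run % 2 == 0:
                breaks.append(i)
            run += 1
        else:
            run = 0
    return breaks


# Five regional indicators in a row: F R D E C
flags = "\U0001F1EB\U0001F1F7\U0001F1E9\U0001F1EA\U0001F1E8"
print(ri_breaks(flags))
```

For the five-indicator string this reports breaks before the third and fifth characters, i.e. two complete pairs followed by a lone indicator.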

Best,

Daniel
Received on Wed Jun 22 2016 - 06:11:01 CDT
