Public Review Issues

Accumulated Feedback on PRI #319

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Wed Feb 10 20:40:45 CST 2016
Name: Roozbeh Pournader
Report Type: Error Report
Opt Subject: emoji zwj sequence in UTR #51 is too broad

While implementing support for emoji ZWJ sequences in Android, we noticed
that the syntax for emoji ZWJ sequences is unnecessarily broad. For example,
it allows flags to be joined to a keycap sequence, which doesn't seem to
serve any purpose.

This creates a lot of implementation difficulties by requiring implementations
to parse very complex expressions as potentially just one glyph. These complex
expressions don't serve any purpose at the moment, slowing down processes
unnecessarily and creating potential attack vectors in text stacks.

Since in practice the emoji ZWJ sequences are limited to base emojis plus
potential variation sequences, we think the current definition ED-15 in UTR
#51 should change from:

emoji_core_sequence ( ZWJ emoji_core_sequence )+

to:

(emoji_character | emoji_variation_sequence)
( ZWJ (emoji_character | emoji_variation_sequence) )+

If more complex emoji ZWJ sequences are added by implementations in the
future, the expression can be expanded, of course. But by looking at
http://www.unicode.org/L2/L2016/16011r3-break-prop-emoji.pdf and based on
discussions with UTC members it seems that such complex cases would be very
unlikely in widely used implementations.
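
As a rough illustration of the narrower syntax proposed above, the following
Python sketch matches one emoji character, optionally followed by a variation
selector, joined by ZWJ to one or more further such elements. The names
SAMPLE_EMOJI and is_narrow_zwj_sequence are hypothetical, the character set
is a tiny hand-picked sample rather than the Emoji property from
emoji-data.txt, and treating both U+FE0E and U+FE0F as valid selectors is an
assumption.

```python
import re

ZWJ = "\u200d"   # ZERO WIDTH JOINER
VS15 = "\ufe0e"  # text presentation selector
VS16 = "\ufe0f"  # emoji presentation selector

# Assumption: a small illustrative sample (man, woman, boy, girl, heart,
# kiss mark). A real implementation would load the full Emoji property.
SAMPLE_EMOJI = "\U0001F468\U0001F469\U0001F466\U0001F467\u2764\U0001F48B"

# One element: emoji_character | emoji_variation_sequence
ELEMENT = f"[{SAMPLE_EMOJI}][{VS15}{VS16}]?"

# element ( ZWJ element )+
NARROW_ZWJ_SEQUENCE = re.compile(f"{ELEMENT}(?:{ZWJ}{ELEMENT})+")


def is_narrow_zwj_sequence(text: str) -> bool:
    """Return True if the whole string fits the narrowed ZWJ syntax."""
    return NARROW_ZWJ_SEQUENCE.fullmatch(text) is not None
```

Under this sketch, a string such as man + ZWJ + woman + ZWJ + girl is
accepted, while a flag joined to a keycap sequence is not, because neither a
two-character flag nor a keycap sequence is a single element.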

Date/Time: Wed Mar 16 15:45:45 CDT 2016
Name: Roozbeh Pournader
Report Type: Error Report
Opt Subject: emoji-data.txt has spaces at the end of some lines

As of today, http://www.unicode.org/Public/emoji/3.0/emoji-data.txt has
unnecessary trailing spaces at the end of some lines. See for example
lines 13, 14, 18, 234, ...
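
A check along the following lines could be used to locate such lines in a
local copy of the file. This is a minimal sketch; the function name
lines_with_trailing_space is hypothetical, and the sample text is made up for
illustration rather than taken from emoji-data.txt.

```python
def lines_with_trailing_space(text: str) -> list[int]:
    """Return the 1-based numbers of lines that end in a space or tab."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]


# Made-up sample: lines 1 and 3 carry trailing whitespace.
sample = "# header \nA ; Emoji\nB ; Emoji\t\n"
flagged = lines_with_trailing_space(sample)
```

To run it against the real data, read the downloaded file as UTF-8 text and
pass its contents to the function.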

Date/Time: Tue Mar 22 12:34:38 CDT 2016
Name: Addison Phillips
Report Type: Public Review Issue
Opt Subject: Use of script subtags to specify emoji presentation is a bad idea

http://www.unicode.org/review/pri319/

In the proposed update to TR51, Section 4.2 introduces something called "Emoji
Script". These are private use script subtags for BCP47 language tags,
apparently introduced in CLDRv29. The idea is that language tags can be used
to somehow "control the presentation" of emoji between the text and emoji
forms.

I think this is a horrible idea. The script subtag is meant to indicate script
variation in a language. Some languages (Chinese, Serbian, Azerbaijani, etc.)
require this subtag to indicate language variation. In other cases,
the use of the script subtag to control the emoji presentation will interfere
with matching or selection mechanisms already in place. Many of these are
naive substring matching implementations. For example, :lang(en-US) in CSS
does not match <p lang="en-Zsye-US"> in HTML because the subtag for "emoji
presentation" has been inserted into the middle of the language tag.
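
The matching failure described above can be sketched with a simple prefix
matcher in the style of RFC 4647 basic filtering, which is one common form of
such naive matching. The function name basic_filter_match is hypothetical,
and this is an illustration rather than the code of any particular browser or
library.

```python
def basic_filter_match(language_range: str, tag: str) -> bool:
    """Case-insensitive prefix match of a language range against a tag.

    The range matches when it equals the tag, or when it is a prefix of
    the tag and the next character in the tag is a hyphen.
    """
    wanted = language_range.lower()
    actual = tag.lower()
    return actual == wanted or actual.startswith(wanted + "-")


plain = basic_filter_match("en-US", "en-US")            # range equals tag
with_script = basic_filter_match("en-US", "en-Zsye-US")  # script in the middle
language_only = basic_filter_match("en", "en-Zsye-US")   # shorter range
```

Here the range en-US matches the tag en-US but not en-Zsye-US, although the
shorter range en still matches both.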

It is also difficult for rendering systems to deal with interstitial subtags
like this, since they must parse the language tag to determine if a script
subtag that applies to emoji exists in the overall tag. Font fallback systems
may look at language tags for other reasons, but mainly they do so to
establish hierarchical fallback regimes or to access OpenType font features.
This seems like a different and somewhat orthogonal usage (the font selection
or feature selection still has to occur around the emoji variation).
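
To give a sense of the parsing step involved, the sketch below pulls the
script subtag out of a language tag. It assumes a well-formed BCP 47 tag of
the ordinary language-extlang-script-region form and does not handle
grandfathered tags; the function name script_subtag is hypothetical.

```python
from typing import Optional


def _is_alpha(subtag: str, length: int) -> bool:
    """True if the subtag is exactly `length` ASCII letters."""
    return len(subtag) == length and subtag.isascii() and subtag.isalpha()


def script_subtag(tag: str) -> Optional[str]:
    """Return the script subtag in title case, or None if there is none."""
    subtags = tag.split("-")
    if subtags[0].lower() == "x":
        return None  # private-use-only tag: no script subtag to find
    index = 1  # skip the primary language subtag
    extlangs = 0
    # Skip up to three extended language subtags (three letters each).
    while index < len(subtags) and extlangs < 3 and _is_alpha(subtags[index], 3):
        index += 1
        extlangs += 1
    # A script subtag, if present, is the next subtag and is four letters.
    if index < len(subtags) and _is_alpha(subtags[index], 4):
        return subtags[index].title()
    return None
```

A renderer would then have to compare the result against the emoji-related
script codes, in addition to whatever it already does with the tag for font
fallback.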

Since the "emoji script" only has bearing on the emoji and has nothing to do
with the language of documents or resources, inserting it into the middle of
tags in this fashion will have breaking consequences. In my opinion, it would
be better to propose variant registration on ietf-languages or to use the
other mechanism proposed in TR51 exclusively.

Date/Time: Thu Mar 24 16:08:12 CDT 2016
Name: Addison Phillips (chair)
Report Type: Public Review Issue
Opt Subject: W3C I18N WG comment on PRI319 [I18N-ACTION-501]

The W3C I18N WG discussed the proposed update to TR51 as part of our teleconference 
of 24 March [https://www.w3.org/2016/03/24-i18n-minutes.html]. During our discussion, 
the WG endorsed my previous personal comment that the use of script subtags to 
identify emoji vs. text presentation of emoji characters was inappropriate for 
the reasons given.

We also noted that there will be four different mechanisms (VS, CSS, script subtag,
and locale extension) for indicating emoji presentation, including two in language
tags, which may lead to confusing presentation behavior.

Date/Time: Mon May 2 01:50:21 CDT 2016
Name: Peter Edberg
Report Type: Public Review Issue
Opt Subject: PRI 319 feedback: remove two 9.0 emoji

Apple proposes that U+1F946 RIFLE not have the property values Emoji=Yes,
Emoji_Presentation=Yes.

Apple also suggests that U+1F93B MODERN PENTATHLON not have these property
values; Unicode 9.0 includes characters that enable use of a sequence to
represent this.

The Emoji and Emoji_Presentation values should be changed to a value that does
not suggest that availability and interchange as emoji is likely (e.g. a value
of No, or possibly the new value Provisional if accepted per L2/16087).