Comments on Public Review Issues

L2/16-205

Comments on Public Review Issues
(May 12 - August 2, 2016)

The sections below contain links to permanent feedback documents for the open Public Review Issues as well as other public feedback as of May 5, 2016, since the previous cumulative document was issued prior to UTC #147 (May 2016). Grayed-out items in the Table of Contents do not have feedback here.

Issue Name Feedback Link

329 Proposed Update UAX #44, Unicode Character Database (feedback) no feedback

328 Feedback on draft additional repertoire for ISO/IEC 10646:2016 (5th edition) DIS (feedback)

327 Feedback on draft additional repertoire for Amendment 1 (PDAM) to ISO/IEC 10646:2016 (5th edition) (feedback)

325 Proposed Update UTS #18, Unicode Regular Expressions (feedback)

The links below go to locations in this document for feedback.

Feedback to UTC / Encoding Proposals
Feedback on UTRs / UAXes
Error Reports

Feedback to UTC / Encoding Proposals

Date/Time: Fri May 13 15:54:35 CDT 2016
Name: Michael Everson
Report Type: Feedback on an Encoding Proposal
Opt Subject: COPYLEFT

Do not call it the COPYLEFT SIGN. This English pun can't be translated. 
Call it the REVERSED COPYRIGHT SIGN please.

Date/Time: Tue May 17 18:12:54 CDT 2016
Name: Andrea Duffie
Report Type: Public Review Issue
Opt Subject: Dinosaur emojis

Dear Unicode,

I recently saw your proposed dinosaur emojis and wanted to encourage you, if
possible, to add additional dinosaur emojis in order to better represent the
diversity of the greatest animals our planet has ever known, as well as those
who love them.

I implore you to incorporate multiple dinosaur emojis into your diverse
offerings, as a single dinosaur isn't just under-representative of dinosaurs
in general, it does a disservice to dinosaur fans across the globe.

People's favorite dinosaurs are often as diverse as our personalities. Some of
us are raptor or T-Rex people, others are much happier with stegosaurs,
triceratops or brachiosaurs. Others prefer more niche species, but each
selection is capable of embodying a specific emotion, mindset or action based
on society's collective knowledge of dinosaurs.

In conclusion, I fully support Andrew West's proposed feedback in regards to a
dinosaur emoji set (http://www.unicode.org/L2/L2016/16103-jurassic-fdbk.pdf)
and encourage you to adopt the variety of silhouettes that is truly
representative of all of dinosauria, as well as the people who love them.
Thanks!

Date/Time: Wed May 18 19:16:11 CDT 2016
Name: Tanya
Report Type: Other Question, Problem, or Feedback
Opt Subject: Brown friendship, family emojis - iphone

Hi,
I was told to write to the Unicode Consortium regarding extending the skin tones of 
emojis to include all of the family, friendship emojis that are currently available.
Thank you for your consideration, or feedback regarding this issue.
Appreciatively,
Tanya C.

Date/Time: Tue May 24 10:40:08 CDT 2016
Name: Laura
Report Type: Other Question, Problem, or Feedback
Opt Subject: missing emojis

Hello, I am a happy Whatsapp user and a happy single mum. With my 5-year-old
son we form a small, but happy family. Pretty disappointed to realize that in
Whatsapp 'politically correct' emojis, all sorts of combinations are
considered for a family, but not the one-parent ones (by choice, by life). I
wrote to Whatsapp about it, but they answered back that they do not create the
emojis, as they come from the standard consortium used most commonly. While
they appreciate feedback from users, they do not have the ability to make any
changes to the current list. This is the reason why I am writing this message:
can you please do something about it?
Thank you,
Laura Badano

Date/Time: Tue May 24 13:38:10 CDT 2016
Name: Courtney Milan
Report Type: Feedback on an Encoding Proposal
Opt Subject: Regarding the Jurassic emoji/dinosaur encoding project

Dear Unicode Technical Committee (and emoji subcommittee),

As the author of the original Jurassic Emoji proposal, L2/16-072, I'm writing
to provide further commentary on my dinosaur emoji proposal based on (a) the
public feedback that the UTC has received from both William Overington and
Andrew West, and (b) public comments from Andrew West and Ken Lunde suggesting
that the UTC is considering encoding only a single representative dinosaur.

Overington's suggestion was that the UTC encode 32 dinosaurs (!!) according to
scientific taxonomic classifications (!!!). This proposal appeals to my sense
of humor and love of dinosaurs, but having the number of encoded dinosaur
emoji exceed the number of living reptile emoji would be a suboptimal use of
Unicode resources and would place a substantial burden on member-implementers
to create emoji that will neither see much use nor provide important
communication tools to the public. It would also set an unfortunate precedent
for the UTC, lead to increasingly frivolous submissions, and distract from the
UTC's substantial and important other commitments to representing languages.

My reaction to West's feedback was initially something like this: 😍😍😍
(three heart-eyes emoji, if this doesn't come through on this form). I am a
fan of dinosaurs, particularly velociraptors. I would be utterly delighted to
have a standalone velociraptor emoji. That being said, the distinctions made
in West's proposal would be unlikely to be represented with any degree of
accuracy from emoji implementation to emoji implementation. The mission of the
UTC is not to encode everyone's favorite dinosaur. The differentiation
suggested by West is delightful to my dinosaur-loving heart, but rationally, I
cannot imagine that the difference between gallimimus and pachycephalosaurus
would be apparent in the 5 millimeter version. Nor do I think that
communication will be severely impeded by the lack of the gallimimus.

My original proposal was for three dinosaurs, and I spent substantial time
whittling down the set of dinosaurs to what I felt was a close-to-minimum set
required for communication. Even though I am a massive fan of velociraptors
(cannot overestimate how massive a fan I am), I specifically chose not to
include both the T-Rex and the velociraptor in my proposal. The 5 millimeter
emoji for one can be used to represent the other.

My choice of three dinosaurs was not based on scientific classification, but
on projected communication and cultural meaning. Predatory dinosaurs occupy
different cultural and emotional significance than non-armored herbivores,
which occupy different representative space than armored dinosaurs. A single
representative dinosaur could not be used to show a pack of velociraptors
attacking a brontosaurus, because the emoji would not differentiate between
predator and prey. All activities for which this stands in as an allegory--
everything from revolutions to riots to internet mobs to journalists going
after an easy target--would not be representable in a single-dinosaur emoji
world. If all dinosaurs look alike, you can't tell that one of them is going
after the other.

Collapsing all dinosaurs into a single Barney-like dinosaur is too much
collapsing. This would make it impossible to represent dinosaur predator/prey
interactions and to use them as a metaphor for the world in which we live.

At a very minimum, I believe that there should be representation for both
herbivore and predator dinosaurs. I urge the UTC to consider more than one
representative dinosaur: at least two (a representative predator and a
representative herbivore), and possibly three.

Thank you again for your time and consideration,
Courtney Milan

Date/Time: Wed Jul 6 03:31:00 CDT 2016
Name: Andrew Dunning
Report Type: Feedback on an Encoding Proposal
Opt Subject: N4704R Proposal

As a medievalist, I am thrilled to see the N4704R proposal
( http://www.unicode.org/L2/L2015/15327r-n4704-medieval-punct.pdf ). A number of
these characters are absolutely critical to communicating premodern Western
texts (e.g. the punctus elevatus is used practically everywhere from about
1000 to 1600), and their absence from Unicode has been inhibiting my work for
years; please ensure that this is brought to completion. I am happy to provide
advice on their usage if it should be desired.

Sincerely,

Andrew Dunning
Curator of Medieval Historical Manuscripts

The British Library

Date/Time: Wed Jul 13 09:58:22 CDT 2016
Name: Andrew West
Report Type: Error Report
Opt Subject: Inconsistent definition of emoji modifier base

Ed Note: This was also forwarded to the emoji subcommittee 2016/07/13.

It seems that there are only five emoji depicting people doing an activity
that are not emoji modifier bases:

U+26F7 SKIER
U+1F3C2  SNOWBOARDER
U+1F3C7  HORSE RACING
U+1F3CC  GOLFER
U+1F93A  FENCER

This is presumably because the people in these activities are usually covered
up so that their skin is not visible.  However, a survey of implementations
shows that this is not necessarily the case, and emoji for each of these five
characters may clearly show face, hand or arm skin (a fencer's free hand is
normally ungloved, as can be seen from a Google image search).  See the
examples at:

http://emojipedia.org/skier/ 
http://emojipedia.org/snowboarder/ 
http://emojipedia.org/horse-racing/ 
http://emojipedia.org/golfer/ 
http://emojipedia.org/fencer/ 

For consistency with other people emoji which may only show a small amount of
skin but which are emoji modifier bases (e.g. Sleuth or Spy), and to make the
user experience in selecting and using emoji less confusing, I suggest that
U+26F7, U+1F3C2, U+1F3C7, U+1F3CC and U+1F93A are all given the emoji modifier
base property, and the corresponding skin tone modifier sequences added to
emoji-sequences.txt.  If an implementation does not show any skin for any of
these emoji then any emoji modifier applied to it will simply have no effect.
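
Ed Note: As an illustration only, the skin tone modifier sequences suggested above could be enumerated as in the Python sketch below. The helper name `modifier_sequences` is hypothetical, the five modifiers are assumed to be the existing emoji modifiers U+1F3FB..U+1F3FF, and the printed lines are not in the exact format of emoji-sequences.txt.

```python
# Code points that the report above suggests should become emoji modifier bases.
PROPOSED_BASES = [0x26F7, 0x1F3C2, 0x1F3C7, 0x1F3CC, 0x1F93A]

# Existing emoji modifiers (skin tones), U+1F3FB..U+1F3FF (assumed here).
MODIFIERS = range(0x1F3FB, 0x1F400)


def modifier_sequences(bases=PROPOSED_BASES):
    """Return every base + modifier pair as a two-code-point string."""
    return [chr(base) + chr(mod) for base in bases for mod in MODIFIERS]


for seq in modifier_sequences():
    # Print the code points in hex, followed by the sequence itself.
    print(" ".join(f"{ord(ch):04X}" for ch in seq), seq)
```

Five bases times five modifiers gives 25 candidate sequences; whether any skin is actually drawn remains up to each implementation, as the report notes.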

Date/Time: Mon Jul 18 13:15:03 CDT 2016
Name: John Mayor
Report Type: Feedback on an Encoding Proposal
Opt Subject: (Q)OPPA/ KOPPA, AND LATIN CAPITAL AND REGULAR SUPERSCRIPT AND SUBSCRIPT "Qs"

Ed Note: the e-mail address given by the sender cannot be reached.

Dear Unicode Consortium Members!... an Open Letter!:... 

Would you please ensure that all of the large and small characters within the
"List of Unicode Characters (see, Wiki)" that can be listed as superscript and
subscript, are included within this List of Unicode Characters, as superscript
and subscript! In particular, as the SS/ SS letter "q" is THE ONLY Latin-based
"SS/ SS holdout" for inclusion within the Latin-based alphabet, I implore of
the Unicode Consortium Members, their due consideration of the incorporation
of this "delinquent letter" of the Latin-based alphabet, within the UC
"Unicode family" of Unicode characters! Furthermore!... and, in addition to
the inclusion of the Latin-based q!... and in the event that such is already
under review by UC Members!... I'd also ask-- should ask!-- that the large and
small Greek (Q)K_OPPA characters (not to be confused with the Greek Kappa
character!) be included, as well, within the List of Latin-based Unicode
Characters as an optional SS/ SS backup for q (i.e., inasmuch, as the KOPPA
character is presently in CHARACTER LIMBO!... i.e., it is not within the
contemporary Greek alphabet!... and so, can be used within the Latin-based
script as an optional SS/ SS q-- if not, a PERMANENT REPLACEMENT for q!)! And
lastly, as the just aforenoted Wiki List hasn't been amended since 2009!...
and as various Unicode issues may already have been addressed, and the
requisite Unicode augmentations and additions may already have been effected
within the UC!... I'd ask the UC Membership (as such would be more effectual
than a BLIND USER!) to undertake a Wikipedia amendment to this severely dated
denoted List! Thanks!! ----- Please!... no emails!... just resolve!

Date/Time: Tue Jul 26 08:18:45 CDT 2016
Name: Masaya Nakamura
Report Type: Feedback on an Encoding Proposal
Opt Subject: Comment on Hentaigana proposal L2/16-188

While most of the "mother ideographs" are shown in traditional form, 
#192 HENTAIGANA LETTER HO-8 is described as "Derived From 8C4A 豊", 
in simplified form. Is it intentional?

Date/Time: Tue Jul 26 12:41:52 CDT 2016
Name: John Cowan
Report Type: Feedback on an Encoding Proposal
Opt Subject: L2/16-174 Extra Aspect Symbols for Astrology

I just want to make sure the typo in the name of 
U+2BFC TRIAGLE WITH EXTENSION 
gets caught and fixed once and for all.  No more FHTORAs!

Date/Time: Tue Jul 26 14:14:34 CDT 2016
Name: John Cowan
Report Type: Feedback on an Encoding Proposal
Opt Subject: 16186-half-stars.pdf

In my view, the half-stars and the half-filled stars are never 
used in the same context, and should be left to fonts to distinguish.

Date/Time: Tue Aug 2 00:07:37 CDT 2016
Name: Cibu Johny
Report Type: Feedback on an Encoding Proposal
Opt Subject: On Virama in Vatteluttu proposal (L2/16-068)

The proposal mentions that "Historically, the script did not possess a virāma
or puḷḷi for silencing the inherent vowel, but such a sign was introduced at a
later time. Consonant clusters are represented linearly. In later sources,
each bare consonant in a cluster is marked by placing the virāma above the
letter."

However, I don't see any concrete evidence for Virama in the plates attached
with the proposal. The only entry I see is the table from Siromoney, et al 1976.
Even though it says those characters are from the Velvikudi inscription, the
scans of the original Velvikudi inscription are not attached to the proposal.
So, it is not clear which glyphs are from Velvikudi inscription and which are
Siromoney's annotations. My guess is only those letters inside the box are
from the inscription.

So my suggestion would be to encode only those characters that have clear
evidence from the historical plates. Add the remaining characters in the
future as the evidence is available.

Date/Time: Tue Aug 2 16:43:24 CDT 2016
Name: Anshuman Pandey
Report Type: Feedback on an Encoding Proposal
Opt Subject: Support for recommendations in L2/16-224

Hello,

For what it is worth, I am writing to support the recommendations made 
by Shriramana Sharma in L2/16-224 regarding the Grantha one-dotted nukta.

All the best,
Anshu

Feedback on UTRs / UAXes

Date/Time: Sun Jun 19 11:33:02 CDT 2016
Name: Bünzli Daniel
Report Type: Public Review Issue
Opt Subject: 9.0.0 segmentation and line breaks on the empty string

I notice that in 9.0.0, UAX29 segmentations no longer report boundaries on 
the empty string while UAX14 still does report a hard line break on it. 

Shouldn't UAX14 also report no breaks on the empty string?
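
Ed Note: The inconsistency described above can be pictured with the toy Python sketch below. It models only the start-of-text and end-of-text behavior as the report describes it; both function names are hypothetical, and no interior boundaries or breaks are computed.

```python
def text_edge_boundaries(text):
    """Segmentation as described for UAX #29 9.0.0: boundaries at the start
    and end of text, except that the empty string has no boundaries."""
    return [] if not text else [0, len(text)]


def end_of_text_line_break(text):
    """Line breaking as described for UAX #14: a hard break is always
    reported at the end of text, even when the text is empty."""
    return [len(text)]


print(text_edge_boundaries(""))    # no boundary positions
print(end_of_text_line_break(""))  # one break, at offset 0
```

Under these assumptions the empty string yields no segmentation boundary but one hard line break, which is the asymmetry the report asks about.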

Date/Time: Wed Jun 22 04:29:58 CDT 2016
Name: Bünzli Daniel
Report Type: Public Review Issue
Opt Subject: UAX #29 9.0.0 Use sot rather than ^

Usage of ^ in the rules is a bit ambiguous since it could well mean that one
needs to detect start of lines. It seems that GB12 and WB15 could simply
replace ^ by sot.

Also the sentence "Grapheme cluster boundaries can be easily tested by looking
at immediately adjacent characters" is no longer true.

Best, 

Daniel
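
Ed Note: The remark about adjacent characters can be illustrated with regional indicators (RI). The Python sketch below assumes the pairing rules have the shape "sot (RI RI)* RI × RI" and "[^RI] (RI RI)* RI × RI"; the function names are hypothetical. It decides whether a boundary falls between two RIs by counting the whole run of RIs before the position, which is more than a look at the two adjacent characters.

```python
RI_FIRST, RI_LAST = 0x1F1E6, 0x1F1FF  # REGIONAL INDICATOR SYMBOL LETTER A..Z


def is_ri(ch):
    """True if ch is a regional indicator symbol."""
    return RI_FIRST <= ord(ch) <= RI_LAST


def ri_break_before(text, i):
    """True if a boundary falls between text[i-1] and text[i], both RIs.

    An odd run of RIs ending at text[i-1] means text[i] completes a pair
    (no break); an even run means text[i] starts a new pair (break).
    """
    assert 0 < i < len(text) and is_ri(text[i - 1]) and is_ri(text[i])
    run = 0
    j = i - 1
    while j >= 0 and is_ri(text[j]):
        run += 1
        j -= 1
    return run % 2 == 0


flags = "\U0001F1EB\U0001F1F7\U0001F1E9\U0001F1EA"  # RI letters F, R, D, E
print([ri_break_before(flags, i) for i in (1, 2, 3)])  # [False, True, False]
```

The same pair of adjacent RIs can therefore be joined or separated depending on how many RIs precede it.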

Date/Time: Sat Jul 23 17:45:24 CDT 2016
Name: Edwin Taylor
Report Type: Error Report
Opt Subject: Unicode® Technical Standard #39

I was reading through "Unicode® Technical Standard #39" and noticed a
potential problem with section 4.2 as seen here:

    http://www.unicode.org/reports/tr39/#Mixed_Script_Confusables 

The body text of "Example 2" is:

    The set of Cyrillic characters {я} does not have a whole-script confusable
    in Latin (there is no Latin character that looks like "я", nor does the
    set of Latin characters {o s t u y} have a whole-script confusable in
    Cyrillic (there is no Cyrillic character that looks like "t" or "u"). Thus
    this string is not a mixed-script confusable.

By counting the parentheses, you can see that there are two "(" characters but
only one ")", so the parentheses are not balanced.  My impression is that
there is a ")" character missing immediately before the first comma.
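
Ed Note: The count above can be reproduced mechanically. The Python sketch below copies the quoted "Example 2" text and reports any parenthesis left without a partner; the helper name `unmatched_parens` is hypothetical.

```python
# Body text of "Example 2" as quoted in the report above.
EXAMPLE_2 = (
    'The set of Cyrillic characters {я} does not have a whole-script confusable '
    'in Latin (there is no Latin character that looks like "я", nor does the '
    'set of Latin characters {o s t u y} have a whole-script confusable in '
    'Cyrillic (there is no Cyrillic character that looks like "t" or "u"). Thus '
    'this string is not a mixed-script confusable.'
)


def unmatched_parens(text):
    """Return (index, char) for each parenthesis left without a partner."""
    open_stack, stray_closers = [], []
    for i, ch in enumerate(text):
        if ch == "(":
            open_stack.append(i)
        elif ch == ")":
            if open_stack:
                open_stack.pop()          # closes the most recent "("
            else:
                stray_closers.append((i, ch))
    return [(i, "(") for i in open_stack] + stray_closers


print(EXAMPLE_2.count("("), EXAMPLE_2.count(")"))  # 2 1
print(unmatched_parens(EXAMPLE_2))
```

The single unmatched entry is the first "(", the one opened before "there is no Latin character".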

As I am pointing this out, I should also say that I found that section
slightly hard to follow, possibly because I was misapplying a definition
earlier in the document which states "X and Y are mixed-script confusables if
they are confusable but they are not single-script confusables.".  If I can
suggest one way to help clarify the examples in section 4.2, it would be to
add a new example between 2 and 3 which deals with the string "toyѕ-я-uѕ",
which uses the Cyrillic letter "ѕ".  I believe this would be a "mixed-script
confusable", and makes an easier contrast against examples 1 and 2, for
completeness.
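
Ed Note: To make the suggested string easier to inspect, the Python sketch below lists each character with a rough script guess. The guess is taken from the first word of the character name, which is only an approximation of the Script property (the standard `unicodedata` module does not expose Script); the helper name `script_guess` is hypothetical. The sketch shows the script mix only and does not test confusability.

```python
import unicodedata


def script_guess(ch):
    """Approximate the script from the first word of the character name."""
    name = unicodedata.name(ch, "")
    first_word = name.split(" ")[0] if name else ""
    return first_word if first_word in ("LATIN", "CYRILLIC") else "COMMON"


# The suggested example string, with the Cyrillic letters written as escapes:
# U+0455 CYRILLIC SMALL LETTER DZE and U+044F CYRILLIC SMALL LETTER YA.
sample = "toy\u0455-\u044F-u\u0455"

for ch in sample:
    print(f"U+{ord(ch):04X} {script_guess(ch):8} {unicodedata.name(ch)}")

scripts = {script_guess(ch) for ch in sample} - {"COMMON"}
print(sorted(scripts))  # ['CYRILLIC', 'LATIN']
```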

Finally, as a separate matter, I would like to point out that this online form
(which requires me to submit personally identifiable information) is not
available (and does not send data) over HTTPS.  This may not be the correct
place to report that, but I hope you don't mind me pointing it out.

Thank you for your time and good work in supporting the Unicode standards.

Error Reports

Date/Time: Thu May 12 13:52:44 CDT 2016
Name: Roozbeh Pournader
Report Type: Error Report
Opt Subject: Indic Syllabic Category of Khamti logograms U+AA74..U+AA76

Based on comments I received from Martin Hosken through Behdad Esfahbod (see
https://github.com/roozbehp/unicode-data/issues/3 ), the three Khamti logograms
take tone marks, so they should have an Indic Syllabic Category.

Here is the information from the original proposal, at
http://www.unicode.org/L2/L2008/08276-khamti-proposal.pdf :

"Three logogram characters are also used which can take tone and whose meaning
is according to the tone they take. They are used when transcribing speech
rather than in formal writing. For example, ˀn takes three tones and means:
ꩵႈ negative, ꩵႉ giving and ꩵး yes. hm also takes three different tones and
means: ꩶႚ part of no (prefixed by hm negative), ꩶႊ question response marker,
ꩶး there. Oay takes two tones and is used when addressing a loved one ꩴႊ or
someone far away ꩴး."

Based on the information, I believe we need to give the characters an Indic
Syllabic Category of either Consonant or Consonant_Placeholder. It appears to
me that Consonant_Placeholder may be a better class, similar to U+104E MYANMAR
SYMBOL AFOREMENTIONED.
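
Ed Note: For reference, the Python sketch below lists the three code points in question and formats a range entry for the value suggested above. The helper name `proposed_entry` is hypothetical, and the line it prints only imitates the general style of a UCD data file; it is not an actual entry.

```python
import unicodedata


def proposed_entry(first, last, value):
    """Format a hypothetical range entry assigning `value` to first..last."""
    first_name = unicodedata.name(chr(first))
    last_name = unicodedata.name(chr(last))
    count = last - first + 1
    return f"{first:04X}..{last:04X}    ; {value} # [{count}] {first_name}..{last_name}"


# The three Khamti logograms discussed in the report.
for cp in range(0xAA74, 0xAA77):
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")

print(proposed_entry(0xAA74, 0xAA76, "Consonant_Placeholder"))
```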

Date/Time: Thu Jun 2 08:46:28 CDT 2016
Name: David Corbett
Report Type: Error Report
Opt Subject: Word_Break of U+02D7

U+02D7 MODIFIER LETTER MINUS SIGN’s Word_Break is MidLetter but should be
ALetter. Like other IPA modifier letters, it follows the letter it modifies;
it does not need a letter after it. Maybe its General_Category should be
changed to Modifier_Letter to match U+02D0 MODIFIER LETTER TRIANGULAR COLON.
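
Ed Note: The practical difference between the two Word_Break values can be sketched as follows. The Python code below prints the General_Category of both characters as known to the local `unicodedata` module, then applies a toy subset of the word boundary rules (only the letter-to-letter rule and the two MidLetter rules, with a break everywhere else). The function `break_before` is hypothetical, works on a list of class names rather than real text, and ignores every other rule.

```python
import unicodedata

# General_Category values as known to the local Python build.
print(unicodedata.category("\u02D7"))  # MODIFIER LETTER MINUS SIGN
print(unicodedata.category("\u02D0"))  # MODIFIER LETTER TRIANGULAR COLON


def break_before(classes, i):
    """Toy word-boundary test between classes[i-1] and classes[i]."""
    prev, cur = classes[i - 1], classes[i]
    nxt = classes[i + 1] if i + 1 < len(classes) else None
    prev2 = classes[i - 2] if i >= 2 else None
    if prev == "ALetter" and cur == "ALetter":
        return False  # letters stay together
    if prev == "ALetter" and cur == "MidLetter" and nxt == "ALetter":
        return False  # MidLetter joins only when a letter follows
    if prev2 == "ALetter" and prev == "MidLetter" and cur == "ALetter":
        return False  # ... and the letter after it stays attached
    return True       # otherwise, break


# A letter followed by U+02D7 and then a non-letter (for example a space):
print(break_before(["ALetter", "MidLetter", "Other"], 1))  # True: mark is split off
print(break_before(["ALetter", "ALetter", "Other"], 1))    # False: mark stays with its letter
```

With MidLetter, a trailing U+02D7 is separated from the letter it modifies unless another letter follows; with ALetter it stays attached, which is the behavior the report argues for.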

Date/Time: Fri Jun 10 19:51:43 CDT 2016
Name: Ken Lunde
Report Type: Error Report
Opt Subject: U+1D378 TALLY MARK FIVE representative glyph issue

This is NOT a Unicode Version 9.0 issue.

The representative glyph for U+1D378 TALLY MARK FIVE in L2/16-171 (aka WG2
N4729) has two errors, the first of which is critical. The critical error is
that the representative glyph has five vertical strokes, but it should have
only four. The less critical error is that the representative glyph exhibits
overlapping subpaths that appear as holes at particular resolutions. See page
49: http://www.unicode.org/L2/L2016/16171-n4729-dis5th-amd1.pdf#page=49 

The representative glyphs from the original proposals, L2/15-328 and
L2/16-065, as used in the documents and as found in the OpenType/CFF fonts
that were attached to them did not exhibit these issues.

These issues have been relayed to Michael Everson, and are being reported here
for the record.
