Public Review Issues

Accumulated Feedback on PRI #350

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Wed Mar 8 05:46:23 CST 2017
Name: Christoph Päper
Report Type: Public Review Issue
Opt Subject: Unicode 10.0 Fantasy Creature Emojis U+1F9D9–F

(I thought I had submitted this feedback in November 2016, 
but apparently I can find no proof of that.)

In [L2/16-304](http://www.unicode.org/L2/L2016/16304-fantasy-creatures.pdf),
the Emoji Subcommittee (ESC) and Tayfun Karadeniz  of Emoji Xpress (who run
) proposed several popular creatures from (primarily
European) fantasy fiction. Several of them have been approved at UTC149 for
inclusion in a future version of the Unicode Standard, probably 10.0 due to be
released in June 2017. The prospective code points and names are these:

- U+1F9D9	🧙 🧙	mage	MAGE
- U+1F9DA	🧚 🧚	fairy	FAIRY
- U+1F9DB	🧛 🧛	vampire	VAMPIRE
- U+1F9DC	🧜 🧜 🧜	merperson	MERPERSON
- U+1F9DD	🧝 🧝	elf	ELF
- U+1F9DE	🧞 🧞	genie	GENIE
- U+1F9DF	🧟 🧟	zombie	ZOMBIE

The lowercase short names from [Emoji 5.0 beta]
( http://www.unicode.org/emoji/charts-beta/emoji-released.html#person-role ) 
are currently the same as the character names intended for [Unicode 10.0]
( http://www.unicode.org/alloc/Pipeline.html ).

I would like to suggest changing the Unicode designations (but not the emoji
short names) to more descriptive, less ambiguous terms. For instance, in some
folklores and languages elves and fairies are basically indistinguishable.
This also avoids the uncommon, almost revisionist term ‘merperson’.

- U+1F9D9	🧙 🧙	mage	PERSON WITH POINTED HAT
- U+1F9DA	🧚 🧚	fairy	PERSON WITH [INSECTOID] WINGS
- U+1F9DB	🧛 🧛	vampire	PERSON WITH FANGS
- U+1F9DC	🧜 🧜 🧜	merperson	PERSON WITH FLUKE [FOR LEGS]
- U+1F9DD	🧝 🧝	elf	PERSON WITH POINTED EARS
- U+1F9DE	🧞 🧞	genie	PERSON EMERGING FROM LANTERN / PERSON WITH BLUE SKIN (Disney bias)
- U+1F9DF	🧟 🧟	zombie	[WOUNDED] PERSON WITH LIFELESS EYES / PERSON WITH GREEN SKIN / UNDEAD PERSON / WALKING CORPSE

At the very least, the essential visual detail that justified their addition
to the standard should be mentioned in an annotation.

By the way, I sincerely believe that a Magic Wand emoji would be used much
more than a Mage emoji. <https://github.com/Crissov/unicode-proposals/issues/160>

Kind regards

    Christoph Päper

Date/Time: Sat Mar 11 21:36:43 CST 2017
Name: Sarabveer Singh
Report Type: Feedback on an Encoding Proposal
Opt Subject: Alternative Form of 0A75 GURMUKHI SIGN YAKASH in Unicode 10.0

As per Unicode Committee Recommendations, a note has been added to the
Gurmukhi Character Chart for Unicode 10
(http://www.unicode.org/Public/10.0.0/charts/blocks/U0A00.pdf) for YAKASH,
stating that there is an alternative glyph. It reads as follows:

0A75  ੵ GURMUKHI SIGN YAKASH 
• some fonts use an alternate glyph shaped more like the lower part of 0A2F ਯ

I ask that an example of the alternative glyph be added to the Gurmukhi
Character Chart PDF and other Unicode Charts and Documentation for the
Gurmukhi Unicode Block.

An example of the alternative glyph for Yakash can be found in the AnmolLipi
font which can be downloaded from (http://www.gurbanifiles.org). The direct
link for the font is here: (http://www.gurbanifiles.org/unicode/AnmolUni.ttf).
The font is licensed under the GNU GPL v2. The Alternative Yakash Glyph can be
found in the same character code as the Gurmukhi Yakash at U+0A75.

Thanks,
Sarabveer Singh

Date/Time: Tue Mar 14 15:07:46 CDT 2017
Name: David Corbett
Report Type: Public Review Issue
Opt Subject: PRI #350: Code chart subheader corrections

U+23FF OBSERVER EYE SYMBOL should not be under the subheader 
“Power symbol from IEEE 1621-2004”.

U+2E44 DOUBLE SUSPENSION MARK should be under the subheader 
“Miscellaneous punctuation”, not under “Typicon punctuation”.

U+312E BOPOMOFO LETTER O WITH DOT ABOVE should not be under the subheader 
“Miscellaneous additions”, since it is not an addition to the script but 
the original form of U+311C.

Date/Time: Thu Mar 16 20:12:02 CDT 2017
Name: Christoph Päper
Report Type: Public Review Issue
Opt Subject: PRI#350 Chiral/Paired Clothing Emojis

Unicode 10.0 is about to introduce two emojis that show pieces of clothing
that come in pairs of two equal, but chiral items, U+1F9E4 Gloves and U+1F9E6
Socks. All existing clothing emojis that exist for left and right hand or foot
have singular names and reference glyphs:

- U+26F8  Ice Skate
- U+1F3BF Ski and Ski Boot
- U+1F45E Mans Shoe
- U+1F45F Athletic Shoe
- U+1F460 High-Heeled Shoe
- U+1F461 Womans Sandal
- U+1F94A Boxing Glove

The only exception is

- U+1F462 Womans Boots

Chiral body parts are also encoded as a single one:

- U+1F91B/C Left-/Right-Facing Fist
- U+1F442 Ear
- U+1F4AA Flexed Biceps
- all hand signs (unless mentioned below)

except for hands that are performing a gesture or action which requires both
of them:

- U+1F44F Clapping Hands Sign
- U+1F450 Open Hands Sign
- U+1F64C Person Raising Both Hands in Celebration
- U+1F64F Person with Folded Hands
- U+1F932 Palms Up Together

Marks left by feet walking are similar to this and therefore also appear in
pairs:

- U+1F43E Paw Prints
- U+1F463 Foot Prints

Emojis of two hand-held items would actually involve the same-side hand of
two individuals, so they are not chiral:

- U+2694  Crossed Swords
- U+1F37B Clinking Beer Mugs
- U+1F38C Crossed Flags
- U+1F91D Handshake
- U+1F93C Wrestlers
- U+1F942 Clinking Glasses

The only other exception for paired body parts is:

- U+1F440 Eyes

but it also exists in a single variant:

- U+1F441 Eye

In conclusion, there is clear precedent for preferring single-sided U+1F9E4
Glove and U+1F9E6 Sock instead!

Implementations could show the mirrored default glyph in certain character
sequences (e.g. ears on both sides of a facial emoji), and should display a
chiral pair if one of the characters mentioned above is reduplicated. That
means, a user who wanted to display a pair of socks would need to input
U+1F9E6+1F9E6 or, to make their intent even more clear, add a ZWJ,
U+1F9E6+200D+1F9E6. The opposite, requesting a single item from a double-item
character, would be much harder and less intuitive to achieve.
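
The reduplicated and ZWJ-joined sequences described above can be written out explicitly. The following is a minimal Python sketch that only builds the code point sequences named in the text; whether any font or platform would render them as a pair is not something the code can show, and the helper names `pair` and `code_points` are hypothetical.

```python
SOCKS = "\U0001F9E6"  # U+1F9E6, named SOCKS in the Unicode 10.0 beta
ZWJ = "\u200D"        # U+200D ZERO WIDTH JOINER


def pair(ch: str, joined: bool = False) -> str:
    """Return ch twice, optionally with a ZWJ in between (hypothetical helper)."""
    return ch + (ZWJ if joined else "") + ch


def code_points(s: str) -> str:
    """Format a string in space-separated U+XXXX notation."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


print(code_points(pair(SOCKS)))               # plain reduplication
print(code_points(pair(SOCKS, joined=True)))  # reduplication with a ZWJ
```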

Proposed changes:

- Rename U+1F9E4: GLOVES ----> GLOVE
- Rename U+1F9E6: SOCKS  ----> SOCK
- Change both reference glyphs accordingly.

Date/Time: Thu Mar 16 20:58:19 CDT 2017
Name: Christoph Päper
Report Type: Public Review Issue
Opt Subject: PRI#350 Human Emojis without Specified Gender

The code point order of existing human emojis Baby, Boy/Girl, Man/Woman, Older
Man/Woman is ascending by age (U+1F465-9,74+5). Child, Adult and Older Adult
(U+1F9D1-3) should also be.

Age is often used as an indicator of proficiency. This should be noted as a
possible use.

This tripartite division (or quadripartite with Baby) is a rather arbitrary
choice and not a cultural universal. Characters for Toddler,
Teen(ager)/Pubescent/Junior, Twen(tysomething)/Young(er) Adult, Middle-Aged
/Best-Ager/Silver-Ager, Old Adult/Elder or others might have made (or still
make) sense, but since this is neither the place nor the time to propose such,
I will only note that a relative term like "Older" may be too ambiguous for
future developments.

Proposed changes:

- Alter the code point of Adult: U+1F9D1 ----> U+1F9D2 
- Alter the code point of Child: U+1F9D2 ----> U+1F9D1 
- Rename code point U+1F9D3: Older Adult ----> Senior
- Add annotation to U+1F9D1 Child: beginner level
- Add annotation to U+1F9D2 Adult: advanced level (or proficient level)
- Add annotation to U+1F9D3 Senior: expert level

Date/Time: Fri Mar 17 08:12:48 CDT 2017
Name: Andrew West
Report Type: Public Review Issue
Opt Subject: Feedback on Unicode 10.0 beta: 1F996 T-REX


The name T-REX for 1F996 is inappropriate for an international character
encoding standard, as it does not reflect the correct and commonly used
scientific name for this dinosaur (Tyrannosaurus rex or abbreviated as T. rex,
but never "T-rex"). Although :t-rex: or :trex: may be suitable names for emoji
input methods, the formal character name should be the full scientific name,
TYRANNOSAURUS REX, or a generic name such as TYRANNOSAUR to indicate that the
emoji does not just represent a single species of Tyrannosaur (cf. SAUROPOD
for 1F995). Therefore change the name for 1F996 to TYRANNOSAURUS REX or
TYRANNOSAUR.

Date/Time: Fri Mar 17 08:34:09 CDT 2017
Name: Andrew West
Report Type: Public Review Issue
Opt Subject: Feedback on Unicode 10.0 beta: 1F3B1 BILLIARDS

It is an extremely bad idea to change the representative glyph for U+1F3B1
BILLIARDS from a picture showing generic set of billiard balls and a cue to a
picture of an "8-ball", as this redefines the meaning of the character, and
makes the character unusable for its original purpose. 1F3B1 was encoded in
the first place for compatibility with Japanese emoji where the character
represented a generic billiards-type game, and the Unicode character has been
used to represent any cue sport, such as Billiards, Snooker and Pool. However,
the 8-ball is only used in one particular type of cue sport, Pool or Pocket
Billiards, and cannot be used to represent the game of Snooker which is more
widely played internationally than the game of Pool. Moreover, the 8-ball is
not even primarily intended to represent the game of pool, but reflects a
fortune-telling device known as the "Magic 8-Ball".

There are emoji for a wide range of sporting activities, and the UTC has gone
to some lengths to ensure that major international sports are represented as
emoji characters. Hijacking the Billiards character to represent a magic
8-ball runs contrary to this effort and sets a very bad precedent. If the UTC
allows incorrect implementations of the design of 1F3B1 by vendors to redefine
the meaning of this character, not only will followers of Snooker and other
cue sports be deprived of the ability to represent their favoured sport on
social media, but it will invalidate existing data that uses the character
with its original and correct meaning.

Therefore I strongly urge the UTC to keep the original glyph design for 1F3B1,
and encode in Unicode 10.0 as a matter of urgency an additional new character
named EIGHT BALL with an 8-ball glyph, and encourage vendors to use the new
character for an 8-ball emoji and the existing 1F3B1 character as a generic
billiards emoji.

Date/Time: Fri Mar 17 09:05:06 CDT 2017
Name: Andrew West
Report Type: Public Review Issue
Opt Subject: Feedback on Unicode 10.0 beta: Hentaigana

I strongly oppose the fast-tracking of 285 hentaigana into Unicode 10.0, out
of synchronization with 10646:2017. The hentaigana characters are part of
10646:2017 Amd. 1 which has not yet completed its PDAM ballot. These
characters are not entirely uncontroversial, as indicated by the long time it
has taken to reach this stage, and there are ballot comments requesting name
changes for them from the Irish NB for the current PDAM1.3 ballot that closed
on 7 March 2017. In principle, and out of respect for the ISO ballot process
and for national bodies participating in SC2, the UTC should not seek to fast-
track into the Unicode Standard any characters that are still at the Committee
stage.

It has taken many years to get Hentaigana accepted for encoding (since 2009),
and it will not harm to wait until the ISO ballot process has completed, and
then include them in Unicode 11.0 next year. Prior to Unicode 9.0 the UTC has
only fast-tracked urgently-needed characters such as officially-used currency
symbols and emoji, but in 9.0 and 10.0 complete sets of characters have been
fast-tracked before they have completed their ISO ballot cycle, thereby
curtailing the possibility of SC2 national bodies making technical changes to
character repertoire and character names. New scripts and sets of characters
are never so urgently required that they need to be fast-tracked into an early
version of Unicode, and thereby risk technical mistakes from being left
undiscovered until it is too late. I therefore urge the UTC to withdraw
Hentaigana from Unicode 10.0, and allow the ISO ballot process to follow its
due course before adding them to Unicode 11.0 next year.

I would further recommend that in future the UTC only fast-tracks those
characters that have completed the CD or PDAM ballot stage with no unaccepted
ballot comments.

Date/Time: Sat Mar 18 17:01:20 CDT 2017
Name: Christoph Päper
Report Type: Public Review Issue
Opt Subject: PRI#350 Additions to Transport and Map Symbols block (U+1F680..FF)

Allegedly, Unicode (or ISO) prefers British English terms and spellings over
American ones. If that was true, and I am not sure it is, U+1F6F7 should
probably be called SLEDGE instead of SLED.

Not least since U+1F47D EXTRATERRESTRIAL ALIEN was a unification of alien
lifeform and spaceship glyphs in the original Japanese carrier emoji sets (see
e.g. L2/08-080), U+1F6F8 FLYING SAUCER should have an annotation linking it to
U+1F47D and perhaps also to U+1F47E ALIEN MONSTER. It should also have an
annotation (in TUS, not just CLDR) for the alias "UFO".

Date/Time: Sat Mar 18 20:52:53 CDT 2017
Name: Charlotte Buff
Report Type: Public Review Issue
Opt Subject: PRI #350: Suboptimal Emoji Character Names

Several of the new emoji characters in Unicode 10 have formal identifiers that 
are problematic in my opinion:

• GRINNING FACE WITH STAR EYES is inconsistent with the already existing
SMILING FACE WITH HEART-SHAPED EYES and SMILING CAT FACE WITH HEART-SHAPED
EYES. It should be changed to GRINNING FACE WITH STAR-SHAPED EYES.

• FACE WITH ONE EYEBROW RAISED and FACE WITH MONOCLE don’t specify the shape
of the mouth. The UTC should review whether this could cause implementation
problems between different vendors.

• FACE WITH ONE EYEBROW RAISED and FACE WITH FINGER COVERING CLOSED LIPS don’t
specify the shape of the eyes. The UTC should review whether this could
cause implementation problems between different vendors.

• I LOVE YOU HAND SIGN should not be named after one meaning in one specific
context. This has caused problems in the past, for example with CALL ME HAND
and the ongoing debate on whether it can also represent the ‘shaka’ gesture
or not. Emoji of hand gestures should be named after their shape, in this
case something like RAISED HAND WITH THUMB AND INDEX AND SMALL FINGERS
EXTENDED.

• T-REX is a very informal, almost silly name for use in such a serious
standard. A better name would be TYRANNOSAURUS or TYRANNOSAURUS REX. Even
better would be a more general name like THEROPOD to better complement
SAUROPOD. It is strange that one of the two dinosaur emoji stands for an
entire group of diverse taxa while the other only includes one single
species.

• FLYING SAUCER similarly is a very informal name. Perhaps something like
ALIEN SPACECRAFT would be more appropriate, especially since by far not all
U.F.O. sightings involve saucer-shaped objects.

• Should the name of SANDWICH reflect what kind of sandwich it is? Chart glyph
suggests that it is made from two slices of white bread, but submarine
sandwiches or open sandwiches are also very popular. May cause confusion
between implementations. There are some old emoji documents in the registry
that discuss a potential character called ‘LONG SANDWICH’.

Date/Time: Tue Mar 21 22:00:13 CDT 2017
Name: Christoph Päper
Report Type: Public Review Issue
Opt Subject: PRI#350 EmojiSources.txt Keycap Sequences

EmojiSources-10.0.0.txt:

> ># Keycap sequences for telephone keypad.
> ># The following 11 mappings are historical. The combining character sequences
> ># in these mappings do not include variation selectors for emoji presentation.
> ># Thus they do not match the named character sequences with keycaps listed in
> ># NamedSequences.txt.
> ># For modern data used in emoji support, see http://www.unicode.org/Public/emoji/latest/

It should be perfectly fine to use emoji sequences which are not fully qualified, assuming that 
higher-level markup specifies emoji presentation.

VS-16 is not included for single-codepoint entries either, e.g. Red Heart:

> > 2764;F991;F7B2;F962
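
To make the difference concrete, here is a small Python sketch comparing the bare forms found in EmojiSources.txt with the forms that carry VS-16 (U+FE0F). The variable names and the `strip_vs16` helper are hypothetical and only illustrate the comparison; they are not part of any Unicode data file.

```python
VS16 = "\uFE0F"    # U+FE0F VARIATION SELECTOR-16, requests emoji presentation
KEYCAP = "\u20E3"  # U+20E3 COMBINING ENCLOSING KEYCAP


def strip_vs16(s: str) -> str:
    """Remove every VS-16 so a sequence can be compared with its bare form."""
    return s.replace(VS16, "")


red_heart_bare = "\u2764"                # as listed in EmojiSources.txt
red_heart_emoji = "\u2764" + VS16        # with explicit emoji presentation
keycap_one_historical = "1" + KEYCAP     # historical mapping, no VS-16
keycap_one_emoji = "1" + VS16 + KEYCAP   # keycap sequence with VS-16

# The two forms differ only by the variation selector.
print(strip_vs16(red_heart_emoji) == red_heart_bare)
print(strip_vs16(keycap_one_emoji) == keycap_one_historical)
```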

Date/Time: Wed Mar 22 10:32:45 CDT 2017
Name: John Cowan
Report Type: Error Report
Opt Subject: All word separators should be Zs

Most word separator characters that put ink on the screen/paper currently have
general category Po.  This is inconsistent with the treatment of OGHAM SPACE
MARK, U+1680, which also displays ink but is considered Zs.  For consistency,
one or the other should be changed.  I recommend that the characters listed
below be changed to Zs, thus defining Zs by the functional property of word
separation rather than the visual property of being inkless.

U+1361 ETHIOPIC WORDSPACE
U+2E31 WORD SEPARATOR MIDDLE DOT
U+10100 AEGEAN WORD SEPARATOR DOT
U+10101 AEGEAN WORD SEPARATOR LINE
U+1039F UGARITIC WORD DIVIDER
U+103D0 OLD PERSIAN WORD DIVIDER
U+1091F PHOENICIAN WORD SEPARATOR
U+1123A KHOJKI WORD SEPARATOR
U+11C43 BHAIKSUKI WORD SEPARATOR
U+12470 CUNEIFORM PUNCTUATION SIGN OLD ASSYRIAN WORD DIVIDER

In the alternative, U+1680 could be changed to Po, but this would make it more
difficult to divide words based on general categories.  I have omitted U+16EB
RUNIC SINGLE PUNCTUATION from this list because it has functions other than
mere word separation.
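
The current general categories can be inspected with Python's standard `unicodedata` module. This is a minimal sketch; the results depend on the Unicode version bundled with the interpreter (see `unicodedata.unidata_version`), and the function name `general_categories` is hypothetical.

```python
import unicodedata

OGHAM_SPACE_MARK = 0x1680
# The code points listed above as candidates for a change to Zs.
WORD_SEPARATORS = [
    0x1361, 0x2E31, 0x10100, 0x10101, 0x1039F,
    0x103D0, 0x1091F, 0x1123A, 0x11C43, 0x12470,
]


def general_categories(cps):
    """Map each code point to its General_Category in the bundled UCD."""
    return {cp: unicodedata.category(chr(cp)) for cp in cps}


print("UCD version:", unicodedata.unidata_version)
for cp, gc in general_categories([OGHAM_SPACE_MARK] + WORD_SEPARATORS).items():
    print(f"U+{cp:04X} {gc}")
```

Code that splits words on `Zs` alone would break at U+1680 but not at any of the other characters, which is the inconsistency the report describes.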

Date/Time: Thu Mar 23 11:52:06 CDT 2017
Name: Ken Lunde
Report Type: Public Review Issue
Opt Subject: PRI #350 (Unicode Version 10.0 Beta): U+3031 through U+3035 need better annotations in the code charts

In case it's not too late for Unicode Version 10.0:

The representative glyphs for U+3031 VERTICAL KANA REPEAT MARK and U+3032
VERTICAL KANA REPEAT WITH VOICED SOUND MARK are somewhat misleading, because
they should be implemented as glyphs that are two-em tall, according to
Japanese typographic conventions, and are used exclusively for vertical
writing. Because changing the representative glyphs to be two-em tall is
likely to break the code charts, I instead recommend that U+3031 through
U+3035 be annotated as follows:

U+3031 & U+3032: These characters are used only in vertical writing, are
implemented as glyphs that are two-em tall, and are precomposed versions of
the sequences <U+3033, U+3035> and <U+3034, U+3035>, respectively.

U+3033 & U+3034: These characters represent the first character of two-
character sequences whose second character is U+3035, and which correspond to
U+3031 and U+3032, respectively.

Kazuraki SP2N Light, Source Han Sans, and Noto Sans CJK are font
implementations that support U+3031 through U+3035 as described above.

Date/Time: Mon Mar 27 2017
Name: Sarabveer Singh
Report Type: Public Review Issue (PRI #350)
Opt Subject: Alternative Form of 0A75 GURMUKHI SIGN YAKASH in Unicode 10.0

As per Unicode Committee Recommendations, a note has been added to the Gurmukhi Character Chart for Unicode 10 (http://www.unicode.org/Public/10.0.0/charts/blocks/U0A00.pdf) for YAKASH, stating that there is an alternative glyph. It reads as follows:

0A75 ੵ GURMUKHI SIGN YAKASH

• some fonts use an alternate glyph shaped more like the lower part of 0A2F ਯ

I ask that an example of the alternative glyph be added to the Gurmukhi Character Chart PDF and other Unicode Charts and Documentation for the Gurmukhi Unicode Block.

An example of the alternative glyph for Yakash can be found in the AnmolLipi font which can be downloaded from http://www.gurbanifiles.org. The direct link for the font is here: http://www.gurbanifiles.org/unicode/AnmolUni.ttf. The font is licensed under the GNU GPL v2. The Alternative Yakash Glyph can be found in the same character code as the Gurmukhi Yakash at U+0A75.

From,

Sarabveer Singh

Date/Time: Wed Apr 5 04:23:35 CDT 2017
Name: Andrew West
Report Type: Public Review Issue
Opt Subject: Feedback on Unicode 10.0 beta: U+1F931 BREAST-FEEDING

Emoji 5.0 defines the new Unicode 10.0 emoji character U+1F931 BREAST-FEEDING as having five skin tone sequences:

1F931 1F3FB   ; Emoji_Modifier_Sequence   ; breast-feeding: light skin tone                                # 10.0  [1] (🤱🏻)
1F931 1F3FC   ; Emoji_Modifier_Sequence   ; breast-feeding: medium-light skin tone                         # 10.0  [1] (🤱🏼)
1F931 1F3FD   ; Emoji_Modifier_Sequence   ; breast-feeding: medium skin tone                               # 10.0  [1] (🤱🏽)
1F931 1F3FE   ; Emoji_Modifier_Sequence   ; breast-feeding: medium-dark skin tone                          # 10.0  [1] (🤱🏾)
1F931 1F3FF   ; Emoji_Modifier_Sequence   ; breast-feeding: dark skin tone                                 # 10.0  [1] (🤱🏿)

This is very problematic as the Breast-Feeding emoji is typically represented 
as a woman breast-feeding a baby (indeed it would be hard to design a glyph 
which did not have both woman and baby), and babies do not necessarily have 
the same skin tone as the woman doing the breast-feeding. Users will be confused 
and angry that they cannot individually specify the skin tone of both people 
depicted in this emoji.

This same issue has already been encountered with other multi-person emoji, and 
was addressed in L2/16-332 "Remove multi-person emoji from Emoji_Modifier_Base", 
but the implications of this issue on U+1F931 BREAST-FEEDING were clearly 
overlooked (perhaps because the character name indicates an activity rather 
than the people involved in the activity). I believe that this problem 
indicates that a longer and more comprehensive review process for emoji, 
involving both public review and review by SC2, is warranted, and the 
fast-tracking of emoji into the Unicode Standard at breakneck speed should 
be slowed down.
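
The structure of the five sequences quoted above can be reproduced in a few lines of Python. This sketch assumes nothing beyond the code points in the quoted data; the function name `modifier_sequences` is hypothetical. It shows that each sequence has room for exactly one modifier, so a second person in the glyph cannot receive a different skin tone.

```python
BREAST_FEEDING = 0x1F931
SKIN_TONE_MODIFIERS = range(0x1F3FB, 0x1F400)  # U+1F3FB..U+1F3FF


def modifier_sequences(base: int) -> list:
    """Build the base + single-modifier sequences for one emoji base."""
    return [chr(base) + chr(mod) for mod in SKIN_TONE_MODIFIERS]


for seq in modifier_sequences(BREAST_FEEDING):
    # Each sequence is exactly two code points: the base and one modifier.
    print(" ".join(f"{ord(c):04X}" for c in seq))
```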

Date/Time: Thu Apr 6 08:02:55 CDT 2017
Name: Andrew West
Report Type: Public Review Issue
Opt Subject: Feedback on Unicode 10.0 beta: U+1F486 FACE MASSAGE

U+1F486 FACE MASSAGE 💆 is represented as a person with their face or head
being massaged by two hands, which from their angle are clearly not the hands
of the person receiving the face massage. Emoji 5.0 data
(http://unicode.org/Public/emoji/5.0/emoji-sequences.txt) defines U+1F486 as a
base character for skin tone modification, but as the skin tone modification
applies equally to all exposed skin in the emoji, implementers apply the same
skin tone modification to both the face of the person getting the massage and
to the hands of the person giving the massage (see e.g.
http://emojipedia.org/face-massage-type-6/). This means that inter-racial massages are not
supported by Unicode emoji, which is extremely problematic. The issue of
racially segregated multi-person emoji was addressed in L2/16-332 "Remove
multi-person emoji from Emoji_Modifier_Base", but the fact that this issue
also applies to U+1F486 FACE MASSAGE was apparently not noticed.

Simply removing Emoji_Modifier_Base from U+1F486 FACE MASSAGE and U+1F931
BREAST-FEEDING is not an ideal solution, and I think that defining a mechanism
for applying multiple emoji modifiers to a single multi-person emoji character
would be a better approach.

Date/Time: Sat Apr 8 06:42:47 CDT 2017
Name: Charlotte Buff
Report Type: Public Review Issue
Opt Subject: PRI #350: Error in Miscellaneous Technical chart

In the code chart for the Miscellaneous Technical block under the subhead
‘Scan lines for terminal graphics’ (U+23BA–U+23BD) it is mentioned that even-
numbered scan lines have been unified with box-drawing graphics. However,
there appear to be no characters in the Box Drawing block that could qualify
as scan lines, with the exception of U+2500 ─ BOX DRAWINGS LIGHT HORIZONTAL
which was unified with scan line-5 (obviously not an even number). I suggest
that either the phrase ‘even-numbered’ be removed from the chart or—if the
remaining scan lines indeed do exist as encoded characters and I just haven’t
found them—cross references to these characters be added for clarity.

Date/Time: Wed Apr 12 20:31:39 CDT 2017
Name: Marcel Schneider
Report Type: Public Review Issue
Opt Subject: PRI#350 Code chart subheader corrections

I apologize for not having reported one item properly: as I have just
discovered, it had already been reported by David Corbett. I overlooked it on
first reading because I sent my feedback while working on the files for
another purpose.

By the way, for U+23FF Iʼd suggest "Optics symbol" as a subhead, as I have
currently set it elsewhere. Since we thus need a dedicated subhead for U+312E
too, Iʼd suggest "Archaic letter".

Best regards,
Marcel

Date/Time: Thu Apr 27 23:19:56 CDT 2017
Name: Weizhe Zheng
Report Type: Public Review Issue
Opt Subject: PRI 350: second medial form of 1820

The glyph for the second medial form of 1820 on the Unicode 10.0 Beta code chart 
is a mistake. The correct glyph is 
  http://www.unicode.org/cgi-bin/varglyph?24-1820-180B-medi
The mistake first appeared on the Unicode 9.0 code chart and was reported in 
Section 2 of L2/16-292. Apparently the compiler of the Unicode 9.0 code chart 
mistook the third medial form for the second medial form. Unfortunately the 
error persisted in 10.0 Beta.

Date/Time: Fri Apr 28 03:19:57 CDT 2017
Name: Ben Longbons
Report Type: Error Report
Opt Subject: Unsorted data files

When UCD files are sorted, it allows more efficient processing. Most UCD files 
*are* sorted, at least within a property value.

The following files are not sorted according to any machine-detectable algorithm, 
as of Unicode 9.0.0:

CompositionExclusions.txt (groups in comments)
NameAliases.txt (appears accidental, can be fixed easily)
NamedSequences.txt (groups with no order at all)
PropertyAliases.txt (groups in comments)
PropertyValueAliases.txt (values are sorted, but properties are not *quite*. appears accidental, can be fixed easily)
SpecialCasing.txt (groups in comments, but not sorted within groups)
StandardizedVariants.txt (groups in comments, but not sorted within groups)
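
A check of this kind is easy to automate for files keyed by code point. The following is a minimal Python sketch, assuming the usual UCD line layout (fields separated by `;`, comments starting with `#`, and a first field that is a code point, a range, or a sequence). The function names are hypothetical, the check is over the whole file rather than per property value, and files keyed by property name, such as PropertyAliases.txt, would need a different sort key.

```python
def first_code_points(lines):
    """Yield the leading code point of each data line in UCD-style text."""
    for line in lines:
        data = line.split("#", 1)[0].strip()   # drop comments
        if not data:
            continue                           # blank or comment-only line
        field = data.split(";", 1)[0].strip()  # first field of the record
        # The field may be "XXXX", a range "XXXX..YYYY", or a sequence "XXXX YYYY".
        start = field.split("..", 1)[0].split()[0]
        yield int(start, 16)


def is_sorted(lines):
    """True if the leading code points never decrease from line to line."""
    cps = list(first_code_points(lines))
    return all(a <= b for a, b in zip(cps, cps[1:]))


# Illustrative sample input in the style of CompositionExclusions.txt.
sample = [
    "# (1) Script Specifics",
    "0958    #  DEVANAGARI LETTER QA",
    "0959    #  DEVANAGARI LETTER KHHA",
    "",
    "2ADC    #  FORKING",
]
print(is_sorted(sample))
```

For a real file, the same function can be applied to `open(path, encoding="utf-8")`.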

Date/Time: Sat Apr 29 18:00:39 CDT 2017
Name: Charlotte Buff
Report Type: Error Report
Opt Subject: Contradictory Statements in UAX #15

Unicode Standard Annex #15 (“Unicode Normalization Forms”) contains two
passages concerning composition exclusions that seem to directly contradict
each other.

Section 3 “Versioning and Stability” states the following:

    »It would be possible to add more compositions in a future version of
    Unicode, as long as the backward compatibility requirement is met. It
    requires that for any new composition XY → Z, at most one of X or Y was
    defined in a previous version of Unicode. That is, Z must be a new
    character, and either X or Y must be a new character. However, the
    Unicode Consortium strongly discourages new compositions, even in such
    restricted cases.«

According to this a hypothetical newly encoded character that is canonically
decomposable can (but need not) be composable under NFC and NFKC if at least
one of the characters in its decomposition mapping was also added in the same
version of Unicode.

However, in section 5 “Composition Exclusion Table” the following is written:

    »When a character with a canonical decomposition is added to Unicode, it
    must be added to the composition exclusion table if there is at least one
    character in its decomposition that existed in a previous version of
    Unicode. If there are no such characters, then it is possible for it to
    be added or omitted from the composition exclusion table. The choice of
    whether to do so or not rests upon whether it is generally used in the
    precomposed form or not.«

According to this *all* characters in the decomposition mapping must be
entirely new to be eligible for composition, which obviously is not the same
as saying that one character can be old as long as the other one is new.

As of Unicode 9.0.0 there are 23 canonically decomposable characters added
after version 3.0.0 that are not excluded from composition:

• U+1B06 ᬆ BALINESE LETTER AKARA TEDUNG
• U+1B08 ᬈ BALINESE LETTER IKARA TEDUNG
• U+1B0A ᬊ BALINESE LETTER UKARA TEDUNG
• U+1B0C ᬌ BALINESE LETTER RA REPA TEDUNG
• U+1B0E ᬎ BALINESE LETTER LA LENGA TEDUNG
• U+1B12 ᬒ BALINESE LETTER OKARA TEDUNG
• U+1B3B ◌ᬻ BALINESE VOWEL SIGN RA REPA TEDUNG
• U+1B3D ◌ᬽ BALINESE VOWEL SIGN LA LENGA TEDUNG
• U+1B40 ◌ᭀ BALINESE VOWEL SIGN TALING TEDUNG
• U+1B41 ◌ᭁ BALINESE VOWEL SIGN TALING REPA TEDUNG
• U+1B43 ◌ᭃ BALINESE VOWEL SIGN PEPET TEDUNG
• U+1109A 𑂚 KAITHI LETTER DDDHA
• U+1109C 𑂜 KAITHI LETTER RHA
• U+110AB 𑂫 KAITHI LETTER VA
• U+1112E ◌𑄮 CHAKMA VOWEL SIGN O
• U+1112F ◌𑄯 CHAKMA VOWEL SIGN AU
• U+1134B ◌𑍋 GRANTHA VOWEL SIGN OO
• U+1134C ◌𑍌 GRANTHA VOWEL SIGN AU
• U+114BB ◌𑒻 TIRHUTA VOWEL SIGN AI
• U+114BC ◌𑒼 TIRHUTA VOWEL SIGN O
• U+114BE ◌𑒾 TIRHUTA VOWEL SIGN AU
• U+115BA ◌𑖺 SIDDHAM VOWEL SIGN O
• U+115BB ◌𑖻 SIDDHAM VOWEL SIGN AU

None of these decompose to characters that existed previously so it remains
unclear which version of the clause actually applies. What makes the matter
more confusing is that U+2ADC ⫝̸ FORKING is listed as an example of a
composition-excluded character whose decomposition existed in a previous
version of Unicode, even though the first character in its decomposition
mapping, U+2ADD ⫝ NONFORKING, was added at the exact same time (Unicode
3.2.0). FORKING is the only Post Composition Version precomposed character
excluded from composition that actually has a decomposition including a
previously existing character (U+0338 ◌̸ COMBINING LONG SOLIDUS OVERLAY).

I suggest that the UTC review the passages in question and prepare an updated
version of UAX #15 that clarifies this discrepancy.
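
The practical difference between the two groups of characters can be observed with Python's standard `unicodedata` module. This is a minimal sketch whose results depend on the Unicode version bundled with the interpreter; the helper name `recomposes` is hypothetical.

```python
import unicodedata


def recomposes(ch: str) -> bool:
    """True if ch has a canonical decomposition that NFC puts back together."""
    decomposed = unicodedata.normalize("NFD", ch)
    return decomposed != ch and unicodedata.normalize("NFC", decomposed) == ch


# U+1B06 BALINESE LETTER AKARA TEDUNG: not in the composition exclusion table.
print(recomposes("\u1B06"))

# U+2ADC FORKING: listed in the composition exclusion table.
print(recomposes("\u2ADC"))
```

A character in the exclusion table stays decomposed after NFC, while the 23 characters listed above are restored to their precomposed form.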

Date/Time: Sun Apr 30 08:30:10 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: NʼKo Diacritics

Hello,

I have a number of feedback items that do not fall under the beta feedback,
but are nevertheless for consideration at the upcoming UTC meeting. Sorry to
be late, Iʼve been busy with French character names to document the keyboard
layouts Iʼm working on, for release under the outgoing government in case our
country falls into instability.

First: NʼKo diacritics seem to be divided into three categories that are
currently unified under a single subheader. (Sorry to have it in French only,
as I canʼt currently spend the time to translate it to English.)

> Three different kinds of marks are under a single subhead.
> Here is how Iʼll propose them from now on:

> @ Marques tonales
> 07EB MARQUE N’KO DE TON BREF HAUT
> x (diacritique macron - 0304)
> 07EC MARQUE N’KO DE TON BREF BAS
> x (diacritique tilde - 0303)
> 07ED MARQUE N’KO DE TON BREF MONTANT
> x (diacritique point en chef - 0307)
> 07EE MARQUE N’KO DE TON LONG DESCENDANT
> x (diacritique accent circonflexe - 0302)
> 07EF MARQUE N’KO DE TON LONG HAUT
> 07F0 MARQUE N’KO DE TON LONG BAS
> 07F1 MARQUE N’KO DE TON LONG MONTANT
> ;0
> @ Diacritiques
> 07F2 DIACRITIQUE N’KO DE NASALISATION
> x (diacritique point souscrit - 0323)
> 07F3 DIACRITIQUE N’KO TRÉMA
> * permet d’étendre le répertoire pour représenter des sons arabes ou français ou d’autres langues
> x (diacritique tréma - 0308)
> ;0
> @ Lettres de ton
> 07F4 APOSTROPHE N’KO DE TON HAUT
> x (lettre modificative apostrophe - 02BC)
> 07F5 APOSTROPHE N’KO DE TON BAS
> x (lettre modificative virgule tournée - 02BB)

Best regards,
Marcel

Date/Time: Sun Apr 30 08:37:40 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Armenian Punctuations

Armenian punctuation marks currently share a subhead with a modifier letter,
as the Gc values show. Would it be useful to split them up?

@		Lettre modificative
0559	DEMI-ROND GAUCHE ARMÉNIEN
	x (lettre modificative virgule réfléchie - 02BD)
	x (lettre modificative demi-rond gauche - 02BF)
	x (diacritique virgule réfléchie en chef - 0314)
;0
@		Ponctuations
055A	APOSTROPHE ARMÉNIENNE
[…]

Regards,
Marcel

Date/Time: Sun Apr 30 08:41:19 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Annotations to characters that may be confused with a diacritic.

« This is an independent character », not « This is a spacing character ». The reason 
is that some of the combining characters are spacing (those of Gc=Mc).

E.g.: 005E	CIRCUMFLEX ACCENT
	* this is a spacing character
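
The distinction can be checked against the General_Category values. The sketch below is only an illustration using Python's standard `unicodedata` module; the three sample characters are chosen here for the example and are not part of the report.

```python
import unicodedata

# U+005E CIRCUMFLEX ACCENT is an independent symbol (Sk).
# U+0302 COMBINING CIRCUMFLEX ACCENT is a nonspacing combining mark (Mn).
# U+0903 DEVANAGARI SIGN VISARGA is a combining mark that is spacing (Mc).
samples = ["\u005E", "\u0302", "\u0903"]

categories = {f"U+{ord(c):04X}": unicodedata.category(c) for c in samples}
for code_point, gc in categories.items():
    print(code_point, gc)
```

Because Gc=Mc characters are both combining and spacing, "spacing" alone does not separate U+005E from all combining marks.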

Date/Time: Sun Apr 30 08:46:13 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Ògham Space Mark

Actually we have:
@		Punctuation
1680	OGHAM SPACE MARK
	* glyph is blank in "stemless" style fonts
	x (space - 0020)

This is Gc=Zs, so the subhead should be Space.
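
The property value cited above can be confirmed with a short check. This is a minimal sketch using Python's standard `unicodedata` module, not part of the original report.

```python
import unicodedata

ogham_space = "\u1680"

name = unicodedata.name(ogham_space)
gc = unicodedata.category(ogham_space)   # General_Category of the character
is_space = ogham_space.isspace()         # Python treats Zs characters as whitespace

print(name, gc, is_space)
```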

Date/Time: Sun Apr 30 08:48:46 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Numbering system in TUS (reminder); Names of Super/Sub script digits

With respect to TUS being a standard, its low resolution numbering system is
inappropriate, as it prevents users from citing loci cross-version. Citing the
whole subhead is clumsy in many circumstances. The side pane of Adobe Reader
shows bookmarks, as does Firefox, but has no context menu to grasp the
bookmark. Opera and Chrome do not have side panes.

Superscript digits in Latin-1 are neither symbols nor punctuation. NBSP is
neither. More subheads are required. Given the detailed subheads in the
further part of Latin-1, subheads are lacking by design in the first part to
bury the superscript digits under a mass of unrelated characters, to avoid
making them stand out.

Names of superscript and subscript digits are defective, because "digit" is
missing, as opposed to all other generic categories (sign, letter, and so on).
The names of all other such characters, except most letters, are built from
"superscript" or "subscript" followed by the plain name. Additional
informative aliases are needed to address this inconsistency.
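
The naming pattern described here can be seen by listing a few character names. The sketch below uses Python's standard `unicodedata` module; the selection of sample characters is an assumption made for illustration.

```python
import unicodedata

# A plain digit, three superscript/subscript digits, a superscript sign
# and a superscript letter.
samples = "1\u00B9\u2074\u2081\u207A\u2071"

names = [unicodedata.name(ch) for ch in samples]
for ch, name in zip(samples, names):
    print(f"U+{ord(ch):04X}", name)
```

The plain digit is named DIGIT ONE, while the superscript and subscript digits drop the word DIGIT; the sign and the letter keep their full plain names after the prefix.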

Date/Time: Sun Apr 30 08:50:05 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: U+1F6EC

1F6EC glyph is wrong; landing airplanes are horizontal, as seen in airport signage.

Date/Time: Sun Apr 30 08:51:05 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: U+1F786

Here is an alias that seems to be displaced:

1F785	MEDIUM BOLD WHITE CIRCLE
	x (medium white circle - 26AA)
1F786	BOLD WHITE CIRCLE
	= very heavy circle
@@@ seems wrong here (displaced)
1F787	HEAVY WHITE CIRCLE
1F788	VERY HEAVY WHITE CIRCLE
1F789	EXTREMELY HEAVY WHITE CIRCLE

Date/Time: Sun Apr 30 08:53:41 CDT 2017
Name: Richard Wordingham
Report Type: Public Review Issue
Opt Subject: Unicode 10.0.0 Beta - InPC for Newa Vowels

I have doubts about the Indic_Positional_Category (InPC) values for four of
the dependent vowels of the Newa script.  On examining the vowel chart (p1265
of http://www.unicode.org/Public/10.0.0/charts/CodeCharts.pdf) one may feel
quite comfortable with assigning the property values:

1143E..1143F  ; Top # Mn   [2] NEWA VOWEL SIGN E..NEWA VOWEL SIGN AI

11440..11441  ; Right # Mc   [2] NEWA VOWEL SIGN O..NEWA VOWEL SIGN AU

However, on consulting Section 3.6 of Anshuman Pandey's 'Proposal to Encode
the Newar Script in ISO/IEC 10646' http://www.unicode.org/L2/L2012/12003r-
newar.pdf, one finds that after the seven headless consonants GA, NYA, TTHA,
NNA, THA, DHA and SHA, the dependent vowels take forms more appropriate to the
property values

1143E  ; Left         # Mn   NEWA VOWEL SIGN E
1143F  ; Top_and_Left # Mn   NEWA VOWEL SIGN AI
11440..11441  ; Left_and_Right # Mc   [2] NEWA VOWEL SIGN O..NEWA VOWEL
SIGN AU

Now, I have no idea what the effect of a right-to-left directional override
should be on the combining marks, but in general I believe gc=Mn makes more
sense for U+1143E and U+1143F, so I am not challenging that property
assignment.
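
For reference, the General_Category values mentioned above can be read with Python's standard `unicodedata` module, assuming a Python build whose character database includes the Newa block (Unicode 9.0 or later). The InPC property itself is not exposed by that module, so this sketch covers gc only.

```python
import unicodedata

# NEWA VOWEL SIGN E, AI, O and AU
newa_vowel_signs = range(0x1143E, 0x11442)

gc_values = {f"U+{cp:04X}": unicodedata.category(chr(cp)) for cp in newa_vowel_signs}
for code_point, gc in gc_values.items():
    print(code_point, gc)
```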

However, I do wonder what the best property values are for a renderer, such as
Microsoft's Universal Shaping Engine.  It seems to me that it may be better to
start with the properties involving 'Left' and use contextual substitutions to
convert the dependent vowels to components of the correct position.  For a
visually full consonant for which 'Top' or 'Right' is the appropriate
placement of the vowel, the default form of the consonant's glyph will not be
appropriate.

Unfortunately, I have not been able to make time to experiment with fonts.

Date/Time: Sun Apr 30 08:54:17 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Subheader for U+1F6E0

The second subheader below would seem to be unnecessary and thus could be suppressed:

@		Signage and other symbols
1F6D0	PLACE OF WORSHIP
1F6D1	OCTAGONAL SIGN
	= stop sign
	* may contain text indicating stop
	x (warning sign - 26A0)
	x (heavy white down-pointing triangle - 26DB)
1F6D2	SHOPPING TROLLEY
	= shopping cart
@		Miscellaneous symbols
1F6E0	HAMMER AND WRENCH

Date/Time: Sun Apr 30 08:56:04 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Crossreferences for U+1F46B and 1F6BB

Iʼd suggest removing the xrefs here:

1F46B	MAN AND WOMAN HOLDING HANDS
	x (restroom - 1F6BB)
@@@@ought to remove xrefs here
1F6BB	RESTROOM
	= man and woman symbol with divider
	= unisex restroom
	x (man and woman holding hands - 1F46B)

Date/Time: Sun Apr 30 08:59:12 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Names, glyphs, usage and annotations of Mountain Cable Transit Emoji

There are some problems showing up when looking at these emoji:

1F6A0	MOUNTAIN CABLEWAY
1F6A1	AERIAL TRAMWAY
1F6A0	TÉLÉPHÉRIQUE
	= tramway aérien
	* deux grandes cabines (navettes)
	; <https://en.wikipedia.org/wiki/Aerial_tramway#Terminology>, <http://www.iemoji.com/view/emoji/861/travel-places/mountain-cableway>
1F6A1	TÉLÉCABINE
	* nombreuses petites gondoles (circulation)
	* parfois incorrectement appelé « aerial tramway »
	; <https://en.wikipedia.org/wiki/Gondola_lift>, <http://www.iemoji.com/view/emoji/862/travel-places/aerial-tramway>
@@@@@ should be annotated because 1F6A1 is a misnomer, and glyph fits 1F6A0, while the latter does not really exist, as a cable this way is technically unfeasible.

Date/Time: Sun Apr 30 09:02:28 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Rocket ornaments

These are closer to weapons than to space rockets. Therefore alleging any use
should be avoided. Instead, pointing to the real ROCKET pictograph would be a
more straightforward disclaimer than the current one, which reads like an
innocent red herring:

@		Rocket ornaments
@+		The rocket ornaments function similarly to fleurons for text decoration, 
and are not intended as pictographs for spaceships.
1F66C	LEFTWARDS ROCKET
	x (rocket - 1F680)
1F66D	UPWARDS ROCKET
1F66E	RIGHTWARDS ROCKET
1F66F	DOWNWARDS ROCKET

I suggest this presentation
(sorry for lack of translation):

@		Fusées
@+		Ces ornements ont un rôle décoratif similaire à celui des vignettes. 
Un pictogramme de fusée est codé au point 1F680 du bloc Symboles cartographiques et de transports (1F680..1F6FF).
1F66C	FUSÉE VERS LA GAUCHE
[…]

Date/Time: Sun Apr 30 09:06:27 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Fleurons

I suggest adding subheaders and annotations to fleurons. 
Actually we have:

@@	1F650	Ornamental Dingbats	1F67F
@		Fleurons
@+		Fleurons are leaf or floral-shaped ornaments used for text decoration.
1F650	NORTH WEST POINTING LEAF
1F651	SOUTH WEST POINTING LEAF
1F652	NORTH EAST POINTING LEAF
1F653	SOUTH EAST POINTING LEAF
1F654	TURNED NORTH WEST POINTING LEAF
1F655	TURNED SOUTH WEST POINTING LEAF
1F656	TURNED NORTH EAST POINTING LEAF
1F657	TURNED SOUTH EAST POINTING LEAF
1F658	NORTH WEST POINTING VINE LEAF
1F659	SOUTH WEST POINTING VINE LEAF
	x (reversed rotated floral heart bullet - 2619)
1F65A	NORTH EAST POINTING VINE LEAF
1F65B	SOUTH EAST POINTING VINE LEAF
	x (rotated floral heart bullet - 2767)
1F65C	HEAVY NORTH WEST POINTING VINE LEAF
1F65D	HEAVY SOUTH WEST POINTING VINE LEAF
1F65E	HEAVY NORTH EAST POINTING VINE LEAF
1F65F	HEAVY SOUTH EAST POINTING VINE LEAF
1F660	NORTH WEST POINTING BUD
1F661	SOUTH WEST POINTING BUD
1F662	NORTH EAST POINTING BUD
1F663	SOUTH EAST POINTING BUD
1F664	HEAVY NORTH WEST POINTING BUD
1F665	HEAVY SOUTH WEST POINTING BUD
1F666	HEAVY NORTH EAST POINTING BUD
1F667	HEAVY SOUTH EAST POINTING BUD

Suggested scheme something like this:
@@	1F650	Casseau ornemental	1F67F
@+		Ce bloc complète le bloc Casseau ou Dingbats (2700..27BF). §22.9
@+		Les trois premiers types d’ornements sont classés dans la catégorie 
des vignettes. En typographie, les vignettes sont des ornements végétaux (à l’origine, 
des ceps, pampres et feuilles de vigne) pour décorer le texte.
;0
@		Feuilles d’arbre
1F650	FEUILLE NORD-OUEST
1F651	FEUILLE SUD-OUEST
1F652	FEUILLE NORD-EST
1F653	FEUILLE SUD-EST
1F654	FEUILLE INVERSÉE NORD-OUEST
	* 1F651 inversée, ou 1F652 réfléchie
1F655	FEUILLE INVERSÉE SUD-OUEST
	* 1F650 inversée, ou 1F653 réfléchie
1F656	FEUILLE INVERSÉE NORD-EST
	* 1F653 inversée, ou 1F650 réfléchie
1F657	FEUILLE INVERSÉE SUD-EST
	* 1F652 inversée, ou 1F651 réfléchie
;0
@		Feuilles de vigne
1F658	FEUILLE DE VIGNE NORD-OUEST
1F659	FEUILLE DE VIGNE SUD-OUEST
	x (cœur floral couché à droite - 2619)
1F65A	FEUILLE DE VIGNE NORD-EST
1F65B	FEUILLE DE VIGNE SUD-EST
	x (cœur floral couché - 2767)
1F65C	FEUILLE DE VIGNE NORD-OUEST À TRAIT FORT
1F65D	FEUILLE DE VIGNE SUD-OUEST À TRAIT FORT
1F65E	FEUILLE DE VIGNE NORD-EST À TRAIT FORT
1F65F	FEUILLE DE VIGNE SUD-EST À TRAIT FORT
;0
@		Bourgeons
1F660	BOURGEON NORD-OUEST
1F661	BOURGEON SUD-OUEST
1F662	BOURGEON NORD-EST
1F663	BOURGEON SUD-EST
1F664	BOURGEON NORD-OUEST À TRAIT FORT
1F665	BOURGEON SUD-OUEST À TRAIT FORT
1F666	BOURGEON NORD-EST À TRAIT FORT
1F667	BOURGEON SUD-EST À TRAIT FORT

Date/Time: Sun Apr 30 09:09:06 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: U+1F41B Sample Glyph

U+1F41B BUG: The sample glyph is inconsistent with the character name. Should 
be an animal of the order of the hemiptera. Kind of a beetle.

Date/Time: Sun Apr 30 09:11:36 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: The zodiacal signs

The zodiacal signs seem to be unequally represented. Iʼd suggest a more precise
subheader for the occidental signs, and cross-referencing the two systems:

@		Signes du zodiaque occidental
@+		Les animaux du zodiaque asiatique sont codés dans la plage 1F400..1F418 du bloc Symboles et pictogrammes (1F300..1F5FF).
@		Animaux du zodiaque asiatique
@+		Les signes du zodiaque occidental sont codés dans la plage 2648..2653 du bloc Symboles divers (2600..26FF).
1F400	RAT
	= premier signe du zodiaque asiatique
1F401	SOURIS
	= premier signe du zodiaque asiatique en Perse
1F402	BŒUF
	= deuxième signe du zodiaque asiatique
1F403	BUFFLE DES INDES
	= deuxième signe du zodiaque asiatique au Vietnam
1F404	VACHE
	= deuxième signe du zodiaque asiatique en Perse
	= bœuf
@		Signe du zodiaque occidental
26CE	SERPENTAIRE
	* treizième signe, entre le scorpion et le sagittaire ; habituellement ignoré dans le zodiaque

Date/Time: Sun Apr 30 09:15:40 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: U+1F398

@		Musical symbols
1F398	MUSICAL KEYBOARD WITH JACKS
	= midi, midi keyboard
	x (musical keyboard - 1F3B9)

This is primarily referred to as a "master keyboard", so this term should show
up somewhere. BTW I believe that the connectors shown in the glyph are plugs,
not jacks, but I may be wrong because I have not had time to check real devices.

However, Iʼd suggest a change in the alias:

@		Musical symbols
1F398	MUSICAL KEYBOARD WITH JACKS
	= master keyboard, midi
	x (musical keyboard - 1F3B9)

Date/Time: Sun Apr 30 09:18:29 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Squared hiragana from ARIB STD B24

Inconsistency between "square" and "squared":
@@	1F200	Enclosed Ideographic Supplement	1F2FF
@		Squared hiragana from ARIB STD B24
1F200	SQUARE HIRAGANA HOKA
	= and others
	# <square> 307B 304B
@		Squared katakana
1F201	SQUARED KATAKANA KOKO
	= here sign
	# <square> 30B3 30B3
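
The mismatch between the two names, alongside the identical decomposition tag, can be displayed as follows. This is a small sketch using Python's standard `unicodedata` module.

```python
import unicodedata

# The first two characters of the Enclosed Ideographic Supplement block.
entries = [
    (unicodedata.name(chr(cp)), unicodedata.decomposition(chr(cp)))
    for cp in (0x1F200, 0x1F201)
]

for name, decomposition in entries:
    print(name, "#", decomposition)
```

Both characters carry the same compatibility tag, but only the second name uses "SQUARED".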

Some suggestions:

@@	1F200	Idéogrammes entourés complémentaires	1F2FF
@+		Ce bloc complète les blocs Lettres et mois CJC entourés (3200..32FF) et Compatibilité CJC (3300..33FF). §22.10
@		Hiragana encadré
1F200	HIRAGANA HOKA DISPOSÉ EN CARRÉ
	= et autres
	* tiré d’ARIB STD B24
	# <carré> 307B 304B
@		Katakanas encadrés
1F201	KATAKANA KOKO ENCADRÉ
	= symbole ici
	# <carré> 30B3 30B3

Date/Time: Sun Apr 30 09:20:19 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: U+1F30A

Proposed alias:
1F30A	WATER WAVE
	= tsunami

Date/Time: Sun Apr 30 09:21:53 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Domino Tiles

@@	1F030	Domino Tiles	1F09F
The Domino tile subheads are almost all wrong. Dominoes are named following the 
least value. True subheads would be: "Tiles with zero dots on the left side", and 
so on. Or this way:
@		Horizontal tiles
1F030	DOMINO TILE HORIZONTAL BACK
@		Zeroes
1F031	DOMINO TILE HORIZONTAL-00-00
1F032	DOMINO TILE HORIZONTAL-00-01
1F033	DOMINO TILE HORIZONTAL-00-02
1F034	DOMINO TILE HORIZONTAL-00-03
1F035	DOMINO TILE HORIZONTAL-00-04
1F036	DOMINO TILE HORIZONTAL-00-05
1F037	DOMINO TILE HORIZONTAL-00-06
1F038	DOMINO TILE HORIZONTAL-01-00
@		Ones
1F039	DOMINO TILE HORIZONTAL-01-01
1F03A	DOMINO TILE HORIZONTAL-01-02
1F03B	DOMINO TILE HORIZONTAL-01-03
1F03C	DOMINO TILE HORIZONTAL-01-04
1F03D	DOMINO TILE HORIZONTAL-01-05
1F03E	DOMINO TILE HORIZONTAL-01-06
@		Zero
1F03F	DOMINO TILE HORIZONTAL-02-00
@		One
1F040	DOMINO TILE HORIZONTAL-02-01
@		Twos
1F041	DOMINO TILE HORIZONTAL-02-02
1F042	DOMINO TILE HORIZONTAL-02-03
1F043	DOMINO TILE HORIZONTAL-02-04
1F044	DOMINO TILE HORIZONTAL-02-05
1F045	DOMINO TILE HORIZONTAL-02-06
@		Zero
1F046	DOMINO TILE HORIZONTAL-03-00
@		One
1F047	DOMINO TILE HORIZONTAL-03-01
@		Two
1F048	DOMINO TILE HORIZONTAL-03-02
@		Threes
1F049	DOMINO TILE HORIZONTAL-03-03
1F04A	DOMINO TILE HORIZONTAL-03-04
1F04B	DOMINO TILE HORIZONTAL-03-05
1F04C	DOMINO TILE HORIZONTAL-03-06
@		Zero
1F04D	DOMINO TILE HORIZONTAL-04-00
@		One
1F04E	DOMINO TILE HORIZONTAL-04-01
@		Two
1F04F	DOMINO TILE HORIZONTAL-04-02
@		Three
1F050	DOMINO TILE HORIZONTAL-04-03
@		Fours
1F051	DOMINO TILE HORIZONTAL-04-04
1F052	DOMINO TILE HORIZONTAL-04-05
1F053	DOMINO TILE HORIZONTAL-04-06
@		Zero
1F054	DOMINO TILE HORIZONTAL-05-00
@		One
1F055	DOMINO TILE HORIZONTAL-05-01
@		Two
1F056	DOMINO TILE HORIZONTAL-05-02
@		Three
1F057	DOMINO TILE HORIZONTAL-05-03
@		Four
1F058	DOMINO TILE HORIZONTAL-05-04
@		Fives
1F059	DOMINO TILE HORIZONTAL-05-05
1F05A	DOMINO TILE HORIZONTAL-05-06
@		Zero
1F05B	DOMINO TILE HORIZONTAL-06-00
@		One
1F05C	DOMINO TILE HORIZONTAL-06-01
@		Two
1F05D	DOMINO TILE HORIZONTAL-06-02
@		Three
1F05E	DOMINO TILE HORIZONTAL-06-03
@		Four
1F05F	DOMINO TILE HORIZONTAL-06-04
@		Five
1F060	DOMINO TILE HORIZONTAL-06-05
@		Six
1F061	DOMINO TILE HORIZONTAL-06-06
It would be useful to add a block annotation:
@@	1F030	Dominos	1F09F
@+		Ce bloc prend en charge les pièces du jeu double-six, posées dans les 
quatre sens possibles. Cela nécessite non 28, mais 98 caractères et deux pour le dos 
de domino. §22.9
@		Dominos horizontaux
@+		Les points sont notés sur deux chiffres dans l’anticipation de l’éventuel 
codage du jeu double-douze ou supérieur.
1F030	DOS DE DOMINO HORIZONTAL
@		Dominos avec zéro point à gauche
1F031	DOMINO HORIZONTAL-00-00
[…]
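
The character count given in the block annotation, and the layout of the horizontal tiles, can be sketched as follows. The helper name `horizontal_tile` is hypothetical; the offsets are inferred from the code points listed above (1F031 for 00-00, 1F038 for 01-00, 1F061 for 06-06).

```python
import unicodedata

HORIZONTAL_BACK = 0x1F030  # DOMINO TILE HORIZONTAL BACK


def horizontal_tile(left: int, right: int) -> str:
    """Return the horizontal tile with the given dot counts (hypothetical helper)."""
    if not (0 <= left <= 6 and 0 <= right <= 6):
        raise ValueError("dot counts must be in 0..6")
    # Tiles follow the back in row-major order: seven right values per left value.
    return chr(HORIZONTAL_BACK + 1 + 7 * left + right)


physical_tiles = 7 * 8 // 2      # 28 tiles in a double-six set
faces_per_axis = 7 * 7           # 49 ordered (left, right) pairs per axis
face_characters = 2 * faces_per_axis   # 98: horizontal and vertical
total_characters = face_characters + 2  # plus one tile back per axis

names = [unicodedata.name(horizontal_tile(a, b)) for a in range(7) for b in range(7)]
print(physical_tiles, face_characters, total_characters)
print(names[0], names[7], names[-1], sep="\n")
```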

Kind regards,

Marcel

Date/Time: Sun Apr 30 09:22:46 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: U+1BC43

1BC43	DUPLOYAN LETTER OA
	* Pernin aw
	* Perrault aw
could be merged to:
	* Pernin, Perrault: aw
(adding a colon after the variant identifiers).

Date/Time: Sun Apr 30 09:24:21 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Derived verb in Duployan annotations

1BC62	LETTRE DE STÉNO DUPLOYÉ O NASAL
	* secondary orientating; invariant direction upwards
and elsewhere, 18 times in the Duployan block:
Exact word seems to be "orienting", not "orientating":
http://www.unicode.org/L2/L2010/10272r2-duployan.pdf

Date/Time: Sun Apr 30 09:26:31 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Names List Syntax: missing @+?

0197	LATIN CAPITAL LETTER I WITH STROKE
	= barred i, i bar
	* African
	* lowercase is 0268
	* ISO 6438 gives lowercase as 026A, not 0268
	x (latin letter small capital i - 026A)
*** Is there a leading @+ missing on line 5? ***

Date/Time: Sun Apr 30 09:30:25 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Suggestions for TUS Mende Kikakui section

Suggestions for TUS Mende Kikakui section

TUS §19.8, Mende Kikakui, and Code Charts: Add some hints about Mansaray, not
cited in the Reference list. Record that the encoding follows Tuchscherer, and
adds differences both in Dalby and Mansaray. Add paragraph in the section
about the scholars and the diverging corpora (corpuses). Do not split the table
in TUS (thereʼs room below). Add annotation to the blockhead in the Code
Charts too.

Date/Time: Sun Apr 30 09:31:37 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: TUS chapter 7 typography

TUS: add comma in §7.1 on page 305:
“Whether to use a Latin ligature*,* is a matter of typographical style as well 
as a result of the orthographical rules of the language.”

Date/Time: Sun Apr 30 09:34:27 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: annotations to U+0133

Would it be useful to add some annotations to U+0133?
A suggestion:
0133	LETTRE MINUSCULE LATINE IJ
	= digramme soudé minuscule ij
	* néerlandais
	* la soudure de ce digramme n’apparaît pas dans toutes les polices
	* avec accent aigu, utiliser la suite 0133 0301 DIACRITIQUE ACCENT AIGU ; les 
polices conformes affichent un accent aigu sur l’i et un deuxième accent aigu sur le j
	# 0069 006A
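
The compatibility mapping in the last line, and the reason the acute-accent annotation matters, can be illustrated with Python's standard `unicodedata` module. This is only a sketch; rendering behaviour in fonts is outside what it can show.

```python
import unicodedata

ij = "\u0133"                 # LATIN SMALL LIGATURE IJ
with_acute = ij + "\u0301"    # followed by COMBINING ACUTE ACCENT

decomposition = unicodedata.decomposition(ij)
nfkd_plain = unicodedata.normalize("NFKD", ij)
nfc_acute = unicodedata.normalize("NFC", with_acute)
nfkd_acute = unicodedata.normalize("NFKD", with_acute)

print(decomposition)
print([f"U+{ord(c):04X}" for c in nfc_acute])
print([f"U+{ord(c):04X}" for c in nfkd_acute])
```

After compatibility decomposition the single accent follows only the j, so the double-accent display has to come from the font handling the sequence 0133 0301.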

Date/Time: Sun Apr 30 09:39:33 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Thai blockheader syntax

Is there a typo in the NamesList markup here?

@@	0E00	Thai	0E7F
@@+
@		Based on TIS 620-2533
@		Consonants
0E01	THAI CHARACTER KO KAI

should probably be:

@@	0E00	Thai	0E7F
@@+
@+		Based on TIS 620-2533.
@		Consonants
0E01	THAI CHARACTER KO KAI
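
Markup slips of this kind could be caught mechanically. The sketch below is a heuristic of my own, not an official NamesList validator: it assumes tab-separated fields and flags any subhead line (`@`) that is followed directly by another subhead, with no character entry in between.

```python
def empty_subheads(lines):
    """Yield (line_number, text) for subheads that head no character entries."""
    pending = None  # most recent subhead still waiting for an entry
    for number, line in enumerate(lines, start=1):
        if line.startswith(("@@", "@+", ";")):
            continue  # block headers, notices and comments are ignored
        if line.startswith("@\t"):
            if pending is not None:
                yield pending
            pending = (number, line.split("\t")[-1])
        elif line.strip():
            pending = None  # an entry or annotation follows the subhead


current = [
    "@@\t0E00\tThai\t0E7F",
    "@@+",
    "@\t\tBased on TIS 620-2533",
    "@\t\tConsonants",
    "0E01\tTHAI CHARACTER KO KAI",
]
corrected = list(current)
corrected[2] = "@+\t\tBased on TIS 620-2533."

flagged_current = list(empty_subheads(current))
flagged_corrected = list(empty_subheads(corrected))
print(flagged_current, flagged_corrected)
```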

Date/Time: Sun Apr 30 09:43:04 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Syriac Block Start and general NamesList featuring

The Names List in the Code Charts should point directly to the relevant
sections in TUS by including numbers such as “§23.2” that could even be
hyperlinked. Using the Index every time is unwieldy.  Each blockhead should
contain at least one section number, and others may be provided in the
annotations. I believe that it is this unwieldiness of the Code Charts that is
responsible for a noticeable part of the lack of Unicode training. That can be
fixed by adding more information in the NamesList. Another important feature
is the expansion of the numbering from two levels to three levels. These
numbers should be placed into the margin of the core specifications.

Here is an example with also some change in content:

@@	0700	Syriac	074F
@		Syriac punctuation and signs
0700	SYRIAC END OF PARAGRAPH
	* marks the end of a paragraph
0701	SYRIAC SUPRALINEAR FULL STOP

becomes:

@@	0700	Syriaque	074F
@+		§9.3
@		Ponctuation
0700	FIN DE PARAGRAPHE SYRIAQUE
0701	POINT SYRIAQUE SUPRALINÉAIRE

Date/Time: Sun Apr 30 09:44:19 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: TUS typo

TUS p. 648 typo paragraph 4 line 3 « dreived » for « derived ».

Date/Time: Sun Apr 30 09:59:44 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Relaxing Restrictions on Superscript Letters

Iʼd like to inform you that in the French version Iʼm supporting name changes for superscript letters:

@@	02B0	Lettres modificatives espaçantes	02FF
@		Lettres modificatives latines en exposant
@+		Ces caractères présentent l’attribut « exposant » en texte brut, pour les systèmes 
de notation, au prix de limitations pratiques en matière de polices et de style. Les noms en « lettre 
modificative », selon un terme technique de phonétique, expriment cette réserve, au prix d’incohérences 
avec les noms des lettres en indice et des deux lettres ci-après. Le standard spécifie que les œils de 
toutes ces lettres doivent être homogènes (§7.8 et §22.4).
		x (exposant lettre minuscule latine i - 2071)
		x (exposant lettre minuscule latine n - 207F)
02B0	EXPOSANT LETTRE MINUSCULE LATINE H
	= lettre modificative minuscule h
	* aspiré
	# <exposant> 0068
	
For Unicode I suggest adding informative aliases, as announced on the Public Mailing List on 
Tue, 17 Jan 2017 09:25:46 +0100 (CET):
http://www.unicode.org/mail-arch/unicode-ml/y2017-m01/0093.html

Quoting myself:

To date, as far as I know, the only domain where superscripts and subscripts are 
mandatory in general text are abbreviations of numerals, titles, entities, 
measurement units, chemical compounds and so on, using Western Arabic digits and 
Latin superscript lowercase. Iʼm quite sure that no other scripts do have this 
typographical convention, that is a part of an old discipline called 
“orthotypography.” While I was wrong mixing it up with orthography, the outstanding 
importance of these rules for unambiguous representation of text calls for special 
treatment in practice and in the Unicode Standard. 
In these ranges, one character is still missing because the UTC has refused to 
encode *LATIN SUPERSCRIPT SMALL LETTER Q, aka *MODIFIER LETTER SMALL Q. 
This has little incidence on general practice. 
The main challenge outside Unicode is the availability of the related glyphs in 
current fonts, as well as their consistency. To date, almost all webmails propose 
only fonts where they are designed in an intentionally inconsistent way, supposedly 
to make them unusable for accurate display: The 'ⁱ' is always far too high, and 
the 'ⁿ' is too bold and with random vertical alignment. In my opinion, the legacy 
status of these two is used as a fake explanation; compare with the inconsistent 
design of '⁶' and '⁹' in some fonts, along with that of '⁰', while there is no 
excuse of “legacy,” unlike for '¹', '²' and '³', where “legacy” is equally abused 
to mess up the typefaces. This applies as well to most other fonts. The only 
correct font-family Iʼve found so far is Calibri. Consistently, this is the body 
font in the default template of Microsoft Word. 

Iʼm facing this issue when writing drafts in my text editor, where however Iʼm able 
to set the font to any value, including Calibri. Displaying this in Calibri lets 
one appreciate the consistent and running-text-like display of the superscripts: 
// This is ᵒʳᵈⁱⁿᵃʳʸ ᵗᵉˣᵗ ˢᵉᵗ ⁱⁿ ᵁⁿⁱᶜᵒᵈᵉ ᴸᵃᵗⁱⁿ ˢᵘᵖᵉʳˢᶜʳⁱᵖᵗ ˢᵐᵃˡˡ ˡᵉᵗᵗᵉʳˢ ᵃⁿᵈ ᵗʷᵒ ᶜᵃᵖⁱᵗᵃˡ ˡᵉᵗᵗᵉʳˢ 
// This is the range: ᵃᵇᶜᵈᵉᶠᵍʰⁱʲᵏˡᵐⁿᵒᵖ ^q_unavailableʳˢᵗᵘᵛʷˣʸᶻ¹²³⁴⁵⁶⁷⁸⁹⁰₁₂₃₄₅₆₇₈₉₀ 
This is how a complete and Unicode conformant typeface is supposed to work. 
In practice, this turns out to be implemented far, far more than U+2044. 
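
Which base letters have a superscript counterpart can be derived from the compatibility decompositions. The following is a sketch using Python's standard `unicodedata` module; the function name `superscript_forms` is hypothetical, and the result depends on the Unicode version bundled with the Python build. For the data discussed in this message, one would expect only q to be reported as missing.

```python
import string
import sys
import unicodedata


def superscript_forms():
    """Map each base character to the code points that decompose to '<super>' plus it."""
    forms = {}
    for cp in range(sys.maxunicode + 1):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep only mappings of the form "<super> XXXX" with a single base character.
        if len(parts) == 2 and parts[0] == "<super>":
            forms.setdefault(chr(int(parts[1], 16)), []).append(cp)
    return forms


forms = superscript_forms()
missing = [letter for letter in string.ascii_lowercase if letter not in forms]

print("lowercase letters without a <super> form:", missing)
print("h:", [f"U+{cp:04X}" for cp in forms["h"]])
```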

The goal is to get Unicode to accept the fact that people use superscript letters 
in French, and super/sub scripts in vulgar fractions, and have them on their 
keyboards, and that these people are not considered as hackers, but as making 
a reasonable, thoughtful and responsive use of the Standard. 
That is not a matter of “value inversion,” but of correcting a particular 
design principle that was misled and biased under a (hypothetically) strong 
influence of *extrinsic* factors from the beginning on (see point 3 below). 
Itʼs good to know about the counter-arguments that may be figured out, so Iʼm 
grateful to all who were so kind as to respond. What bothers me is that there is 
still so much persistent opposition; and what makes me fear the worst is that 
the arguments raised against the general use of preformatted characters are 
so biased and fallacious, unlike any normal-time reasoning: 
1) Missing font support as an argument against the use of a character has never, 
    never been the way Unicode worked, so far as Iʼve been given the opportunity 
    to understand something of Unicode till now. 
2) This missing font support is mostly a consequence of the Unicode strategy on 
    these characters: Discouraging their use and even misnaming them intentionally 
    in an inconsistent manner (from an overall point of view), Unicode drove 
    a significant part of the font designers away from adding them completely and 
    with a consistent design, and from implementing combining marks support for 
    these characters. 
3) This strategy is biased from the beginning on, as it goes against the user 
    preferences of Latin script using countries, while AFAIK all countries 
    using other scripts are unconcerned because they actually donʼt *use* 
    superscripting in such an *extensive* way. Please correct me if Iʼm wrong. 
    Consequently, there would be *nobody* asking for more (except the already 
    discussed completion of some ranges of Latin script). This strategy of shooing 
    users (and their developers) away from using preformatted letters and digits 
    seems to aim at nothing more serious than supporting software vendorsʼ 
    marketing strategies, even though the software does not need marketing based 
    on poor character support (or on poor keyboard layouts). 


The core issue is the use of these letters in current text in some languages 
that need them to apply a typographic convention that is close to orthography. 
Superscripting is a far, far stronger requirement than all other formatting 
conventions, as it can affect the spelling of the grammatical entity. 
Weʼre facing strong demands on user side relayed by standards bodies from the 
early times on, when ordinal indicators were first encoded as a part of Latin-1. 
Today most users still type a degree sign to emulate a superscript o, and the 
French NB (that Iʼm not a part of, nor am I a member of the keyboard standard WG) 
wishes an ordinal indicator on the keyboard to represent the most common ordinal 
indicator in French: "ᵉ". 

Additionally, I now suggest to add an informative alias to each one of the 
(intentionally) misnamed characters. This “MODIFIER LETTER” disguise of the true 
*LATIN SUPERSCRIPT LETTERs seems to me a twisted trick to make inadvertent people 
believe that this is a thing for insiders that is completely useless to other people. 
The truth happens to show up wherever the editorial committee (as well as 
anybody else) can afford to feel free to write their own, unbiased language: 
[Iʼm highlighting with uppercase] 
@ Latin superscript modifier letters 
@+ See also SUPERSCRIPT LATIN LETTERS in the Spacing Modifier Letters block starting at 02B0. 
1D2C MODIFIER LETTER CAPITAL A 
... 
I think that the "MODIFIER LETTER" labeling of these characters is not 
straightforward enough for a standard that claims that the character names are 
mere identifiers. This is an example of how the identifiers were (ab)used as 
descriptors, to carry prescriptions and corporate preferences on how to use or 
not to use the repertoire. 
When Iʼm back writing up some keyboard documentation, I really would like to 
be able to deliver a better image of Unicode – and of Microsoft – than that one. 
Please help me improve my communication, and make Unicode a user-centered standard. 
Below are the proposed additions, that Iʼd like to submit to your kind review 
prior to posting them with the Contact Form. 
Regards, 
Marcel 
NamesList snippets with additional informative aliases providing straightforward 
character identifiers, and some comment lines: 
(Original file: 
http://www.unicode.org/Public/UCD/latest/ucd/NamesList.txt 
) 
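
Most of the aliases in the snippets that follow look as if they are derived mechanically from the formal names. The sketch below encodes that reading as three prefix rules; the rules and the function name `proposed_alias` are my own inference from the listed aliases, and a few entries (for example 02E4) deviate from them.

```python
import unicodedata

# Hypothetical rewriting rules inferred from the listed aliases.
# The first matching prefix wins, so the longer "SMALL CAPITAL" prefix comes first.
RULES = [
    ("MODIFIER LETTER SMALL CAPITAL ", "latin letter small capital "),
    ("MODIFIER LETTER SMALL ", "latin superscript small letter "),
    ("MODIFIER LETTER CAPITAL ", "latin superscript capital letter "),
]


def proposed_alias(name):
    """Return the alias suggested by the rules, or None if no rule applies."""
    for prefix, replacement in RULES:
        if name.startswith(prefix):
            return replacement + name[len(prefix):].lower()
    return None


for cp in (0x02B0, 0x02B6, 0x1D2D, 0x02DF):
    name = unicodedata.name(chr(cp))
    print(f"{cp:04X}", name, "=", proposed_alias(name))
```
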
@@ 02B0 Spacing Modifier Letters 02FF 
@+ Superscript and subscript letters were not intended to replace markup, but they are 
for use where super/sub scripting is important in 
plain text, or formatting is inappropriate. 
@ Latin superscript modifier letters 
@+ "modifier letter small" stands for "latin superscript small letter", and "modifier letter 
small capital" for "latin letter small capital". 
x (superscript latin small letter i - 2071) 
x (superscript latin small letter n - 207F) 
02B0 MODIFIER LETTER SMALL H 
= latin superscript small letter h 
* aspiration 
# <super> 0068 
02B1 MODIFIER LETTER SMALL H WITH HOOK 
= latin superscript small letter h with hook 
* breathy voiced, murmured 
x (latin small letter h with hook - 0266) 
x (combining diaeresis below - 0324) 
# <super> 0266 
02B2 MODIFIER LETTER SMALL J 
= latin superscript small letter j 
* palatalization 
x (combining palatalized hook below - 0321) 
# <super> 006A 
02B3 MODIFIER LETTER SMALL R 
= latin superscript small letter r 
# <super> 0072 
02B4 MODIFIER LETTER SMALL TURNED R 
= latin superscript small letter turned r 
x (latin small letter turned r - 0279) 
# <super> 0279 
02B5 MODIFIER LETTER SMALL TURNED R WITH HOOK 
= latin superscript small letter turned r with hook 
x (latin small letter turned r with hook - 027B) 
# <super> 027B 
02B6 MODIFIER LETTER SMALL CAPITAL INVERTED R 
= latin letter small capital inverted r 
* preceding four used for r-coloring or r-offglides 
x (latin letter small capital inverted r - 0281) 
# <super> 0281 
02B7 MODIFIER LETTER SMALL W 
= latin superscript small letter w 
* labialization 
x (combining inverted double arch below - 032B) 
# <super> 0077 
02B8 MODIFIER LETTER SMALL Y 
= latin superscript small letter y 
* palatalization 
* common Americanist usage for 02B2 
# <super> 0079 
[…] 
@ Additions based on 1989 IPA 
02DE MODIFIER LETTER RHOTIC HOOK 
* rhotacization in vowel 
* often ligated: 025A = 0259 + 02DE; 025D = 025C + 02DE 
02DF MODIFIER LETTER CROSS ACCENT 
* Swedish grave accent 
02E0 MODIFIER LETTER SMALL GAMMA 
= latin superscript small letter gamma 
* these modifier letters are occasionally used in transcription of affricates 
# <super> 0263 
02E1 MODIFIER LETTER SMALL L 
= latin superscript small letter l 
# <super> 006C 
02E2 MODIFIER LETTER SMALL S 
= latin superscript small letter s 
# <super> 0073 
02E3 MODIFIER LETTER SMALL X 
= latin superscript small letter x 
# <super> 0078 
02E4 MODIFIER LETTER SMALL REVERSED GLOTTAL STOP 
= latin superscript letter reversed glottal stop 
# <super> 0295 
[…] 
@ Latin superscript modifier letters 
@+ See also superscript Latin letters in the Spacing Modifier Letters block starting at 02B0. 
1D2C MODIFIER LETTER CAPITAL A 
= latin superscript capital letter a 
# <super> 0041 
1D2D MODIFIER LETTER CAPITAL AE 
= latin superscript capital letter ae 
# <super> 00C6 
1D2E MODIFIER LETTER CAPITAL B 
= latin superscript capital letter b 
# <super> 0042 
1D2F MODIFIER LETTER CAPITAL BARRED B 
= latin superscript capital letter barred b 
1D30 MODIFIER LETTER CAPITAL D 
= latin superscript capital letter d 
# <super> 0044 
1D31 MODIFIER LETTER CAPITAL E 
= latin superscript capital letter e 
# <super> 0045 
1D32 MODIFIER LETTER CAPITAL REVERSED E 
= latin superscript capital letter reversed e 
# <super> 018E 
1D33 MODIFIER LETTER CAPITAL G 
= latin superscript capital letter g 
# <super> 0047 
1D34 MODIFIER LETTER CAPITAL H 
= latin superscript capital letter h 
# <super> 0048 
1D35 MODIFIER LETTER CAPITAL I 
= latin superscript capital letter i 
# <super> 0049 
1D36 MODIFIER LETTER CAPITAL J 
= latin superscript capital letter j 
# <super> 004A 
1D37 MODIFIER LETTER CAPITAL K 
= latin superscript capital letter k 
# <super> 004B 
1D38 MODIFIER LETTER CAPITAL L 
= latin superscript capital letter l 
# <super> 004C 
1D39 MODIFIER LETTER CAPITAL M 
= latin superscript capital letter m 
# <super> 004D 
1D3A MODIFIER LETTER CAPITAL N 
= latin superscript capital letter n 
# <super> 004E 
1D3B MODIFIER LETTER CAPITAL REVERSED N 
= latin superscript capital letter reversed n 
1D3C MODIFIER LETTER CAPITAL O 
= latin superscript capital letter o 
# <super> 004F 
1D3D MODIFIER LETTER CAPITAL OU 
= latin superscript capital letter ou 
# <super> 0222 
1D3E MODIFIER LETTER CAPITAL P 
= latin superscript capital letter p 
# <super> 0050 
1D3F MODIFIER LETTER CAPITAL R 
= latin superscript capital letter r 
# <super> 0052 
1D40 MODIFIER LETTER CAPITAL T 
= latin superscript capital letter t 
# <super> 0054 
1D41 MODIFIER LETTER CAPITAL U 
= latin superscript capital letter u 
# <super> 0055 
1D42 MODIFIER LETTER CAPITAL W 
= latin superscript capital letter w 
# <super> 0057 
1D43 MODIFIER LETTER SMALL A 
= latin superscript small letter a 
# <super> 0061 
1D44 MODIFIER LETTER SMALL TURNED A 
= latin superscript small letter turned a 
# <super> 0250 
1D45 MODIFIER LETTER SMALL ALPHA 
= latin superscript small letter alpha 
# <super> 0251 
1D46 MODIFIER LETTER SMALL TURNED AE 
= latin superscript small letter turned ae 
# <super> 1D02 
1D47 MODIFIER LETTER SMALL B 
= latin superscript small letter b 
# <super> 0062 
1D48 MODIFIER LETTER SMALL D 
= latin superscript small letter d 
# <super> 0064 
1D49 MODIFIER LETTER SMALL E 
= latin superscript small letter e 
# <super> 0065 
1D4A MODIFIER LETTER SMALL SCHWA 
= latin superscript small letter schwa 
# <super> 0259 
1D4B MODIFIER LETTER SMALL OPEN E 
= latin superscript small letter open e 
# <super> 025B 
1D4C MODIFIER LETTER SMALL TURNED OPEN E 
= latin superscript small letter turned open e 
* more appropriate equivalence would be to 1D08 
# <super> 025C 
1D4D MODIFIER LETTER SMALL G 
= latin superscript small letter g 
# <super> 0067 
1D4E MODIFIER LETTER SMALL TURNED I 
= latin superscript small letter turned i 
1D4F MODIFIER LETTER SMALL K 
= latin superscript small letter k 
# <super> 006B 
1D50 MODIFIER LETTER SMALL M 
= latin superscript small letter m 
# <super> 006D 
1D51 MODIFIER LETTER SMALL ENG 
= latin superscript small letter eng 
# <super> 014B 
1D52 MODIFIER LETTER SMALL O 
= latin superscript small letter o 
# <super> 006F 
1D53 MODIFIER LETTER SMALL OPEN O 
= latin superscript small letter open o 
# <super> 0254 
1D54 MODIFIER LETTER SMALL TOP HALF O 
= latin superscript small letter top half o 
# <super> 1D16 
1D55 MODIFIER LETTER SMALL BOTTOM HALF O 
= latin superscript small letter bottom half o 
# <super> 1D17 
1D56 MODIFIER LETTER SMALL P 
= latin superscript small letter p 
# <super> 0070 
1D57 MODIFIER LETTER SMALL T 
= latin superscript small letter t 
# <super> 0074 
1D58 MODIFIER LETTER SMALL U 
= latin superscript small letter u 
# <super> 0075 
1D59 MODIFIER LETTER SMALL SIDEWAYS U 
= latin superscript small letter sideways u 
# <super> 1D1D 
1D5A MODIFIER LETTER SMALL TURNED M 
= latin superscript small letter turned m 
# <super> 026F 
1D5B MODIFIER LETTER SMALL V 
= latin superscript small letter v 
# <super> 0076 
1D5C MODIFIER LETTER SMALL AIN // (a misnomer also as it should be MODIFIER LETTER AIN; 
cf. 1D25 LATIN LETTER AIN, A724 LATIN CAPITAL 
LETTER EGYPTOLOGICAL AIN, A725 LATIN SMALL LETTER EGYPTOLOGICAL AIN) 
= latin superscript letter ain 
# <super> 1D25 
@ Greek superscript modifier letters 
1D5D MODIFIER LETTER SMALL BETA 
= greek superscript small letter beta 
# <super> 03B2 
1D5E MODIFIER LETTER SMALL GREEK GAMMA 
= greek superscript small letter gamma 
# <super> 03B3 
1D5F MODIFIER LETTER SMALL DELTA // (a misnomer also as it should be MODIFIER LETTER SMALL 
GREEK DELTA, cf. 1E9F LATIN SMALL LETTER 
DELTA) 
= greek superscript small letter delta 
# <super> 03B4 
1D60 MODIFIER LETTER SMALL GREEK PHI 
= greek superscript small letter phi 
# <super> 03C6 
1D61 MODIFIER LETTER SMALL CHI 
= greek superscript small letter chi 
# <super> 03C7 
@ Latin subscript modifier letters 
1D62 LATIN SUBSCRIPT SMALL LETTER I 
# <sub> 0069 
1D63 LATIN SUBSCRIPT SMALL LETTER R 
# <sub> 0072 
1D64 LATIN SUBSCRIPT SMALL LETTER U 
# <sub> 0075 
1D65 LATIN SUBSCRIPT SMALL LETTER V 
# <sub> 0076 
@ Greek subscript modifier letters 
1D66 GREEK SUBSCRIPT SMALL LETTER BETA 
# <sub> 03B2 
1D67 GREEK SUBSCRIPT SMALL LETTER GAMMA 
# <sub> 03B3 
1D68 GREEK SUBSCRIPT SMALL LETTER RHO 
# <sub> 03C1 
1D69 GREEK SUBSCRIPT SMALL LETTER PHI 
# <sub> 03C6 
1D6A GREEK SUBSCRIPT SMALL LETTER CHI 
# <sub> 03C7 
[…] 
@ Modifier letters 
@+ Other modifier letters can be found in the Spacing Modifier Letters, Phonetic Extensions, 
as well as Superscripts and Subscripts blocks. 
1D9B MODIFIER LETTER SMALL TURNED ALPHA 
= latin superscript small letter turned alpha 
# <super> 0252 
1D9C MODIFIER LETTER SMALL C 
= latin superscript small letter c 
# <super> 0063 
1D9D MODIFIER LETTER SMALL C WITH CURL 
= latin superscript small letter c with curl 
# <super> 0255 
1D9E MODIFIER LETTER SMALL ETH 
= latin superscript small letter eth 
# <super> 00F0 
1D9F MODIFIER LETTER SMALL REVERSED OPEN E 
= latin superscript small letter reversed open e 
# <super> 025C 
1DA0 MODIFIER LETTER SMALL F 
= latin superscript small letter f 
# <super> 0066 
1DA1 MODIFIER LETTER SMALL DOTLESS J WITH STROKE 
= latin superscript small letter dotless j with stroke 
# <super> 025F 
1DA2 MODIFIER LETTER SMALL SCRIPT G 
= latin superscript small letter script g 
# <super> 0261 
1DA3 MODIFIER LETTER SMALL TURNED H 
= latin superscript small letter turned h 
# <super> 0265 
1DA4 MODIFIER LETTER SMALL I WITH STROKE 
= latin superscript small letter i with stroke 
# <super> 0268 
1DA5 MODIFIER LETTER SMALL IOTA 
= latin superscript small letter iota 
# <super> 0269 
1DA6 MODIFIER LETTER SMALL CAPITAL I 
= latin letter small capital i 
* not for use in UPA 
x (modifier letter capital i - 1D35) 
# <super> 026A 
1DA7 MODIFIER LETTER SMALL CAPITAL I WITH STROKE 
= latin letter small capital i with stroke 
# <super> 1D7B 
1DA8 MODIFIER LETTER SMALL J WITH CROSSED-TAIL 
= latin superscript small letter j with crossed-tail 
# <super> 029D 
1DA9 MODIFIER LETTER SMALL L WITH RETROFLEX HOOK 
= latin superscript small letter l with retroflex hook 
# <super> 026D 
1DAA MODIFIER LETTER SMALL L WITH PALATAL HOOK 
= latin superscript small letter l with palatal hook 
# <super> 1D85 
1DAB MODIFIER LETTER SMALL CAPITAL L 
= latin letter small capital l 
* not for use in UPA 
x (modifier letter capital l - 1D38) 
# <super> 029F 
1DAC MODIFIER LETTER SMALL M WITH HOOK 
= latin superscript small letter m with hook 
# <super> 0271 
1DAD MODIFIER LETTER SMALL TURNED M WITH LONG LEG 
= latin superscript small letter turned m with long leg 
# <super> 0270 
1DAE MODIFIER LETTER SMALL N WITH LEFT HOOK 
= latin superscript small letter n with left hook 
# <super> 0272 
1DAF MODIFIER LETTER SMALL N WITH RETROFLEX HOOK 
= latin superscript small letter n with retroflex hook 
# <super> 0273 
1DB0 MODIFIER LETTER SMALL CAPITAL N 
= latin letter small capital n 
* not for use in UPA 
x (modifier letter capital n - 1D3A) 
# <super> 0274 
1DB1 MODIFIER LETTER SMALL BARRED O 
= latin superscript small letter barred o 
# <super> 0275 
1DB2 MODIFIER LETTER SMALL PHI 
= latin superscript small letter phi 
# <super> 0278 
1DB3 MODIFIER LETTER SMALL S WITH HOOK 
= latin superscript small letter s with hook 
# <super> 0282 
1DB4 MODIFIER LETTER SMALL ESH 
= latin superscript small letter esh 
# <super> 0283 
1DB5 MODIFIER LETTER SMALL T WITH PALATAL HOOK 
= latin superscript small letter t with palatal hook 
# <super> 01AB 
1DB6 MODIFIER LETTER SMALL U BAR 
= latin superscript small letter u bar 
# <super> 0289 
1DB7 MODIFIER LETTER SMALL UPSILON 
= latin superscript small letter upsilon 
# <super> 028A 
1DB8 MODIFIER LETTER SMALL CAPITAL U 
= latin letter small capital u 
* not for use in UPA 
x (modifier letter capital u - 1D41) 
# <super> 1D1C 
1DB9 MODIFIER LETTER SMALL V WITH HOOK 
= latin superscript small letter v with hook 
# <super> 028B 
1DBA MODIFIER LETTER SMALL TURNED V 
= latin superscript small letter turned v 
# <super> 028C 
1DBB MODIFIER LETTER SMALL Z 
= latin superscript small letter z 
# <super> 007A 
1DBC MODIFIER LETTER SMALL Z WITH RETROFLEX HOOK 
= latin superscript small letter z with retroflex hook 
# <super> 0290 
1DBD MODIFIER LETTER SMALL Z WITH CURL 
= latin superscript small letter z with curl 
# <super> 0291 
1DBE MODIFIER LETTER SMALL EZH 
= latin superscript small letter ezh 
# <super> 0292 
1DBF MODIFIER LETTER SMALL THETA 
= latin superscript small letter theta 
# <super> 03B8 
[…] 
@ Additions for Extended IPA 
A7F8 MODIFIER LETTER CAPITAL H WITH STROKE 
= latin superscript capital letter h with stroke 
* faucalized 
# <super> 0126 
A7F9 MODIFIER LETTER SMALL LIGATURE OE 
= latin superscript small ligature oe 
* labialized: open-rounded 
# <super> 0153 
[…] 
@ Modifier letters for German dialectology 
AB5B MODIFIER BREVE WITH INVERTED BREVE 
x (breve - 02D8) 
x (close up - 2050) 
x (metrical breve - 23D1) 
AB5C MODIFIER LETTER SMALL HENG 
= latin superscript small letter heng 
# <super> A727 
AB5D MODIFIER LETTER SMALL L WITH INVERTED LAZY S 
= latin superscript small letter l with inverted lazy s 
# <super> AB37 
AB5E MODIFIER LETTER SMALL L WITH MIDDLE TILDE 
= latin superscript small letter l with middle tilde 
# <super> 026B 
AB5F MODIFIER LETTER SMALL U WITH LEFT HOOK 
= latin superscript small letter u with left hook 
# <super> AB52

Date/Time: Sun Apr 30 10:02:43 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: TUS typo

Sorry for not collecting typo reports into one single batch:

TUS p. 445, 3rd line before end of page:
"are" is repeated.

Regards,
Marcel

Date/Time: Sun Apr 30 10:05:56 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: U+1091F

Order is not enforced by Unibook, although traditional:

@		Punctuation
1091F	PHOENICIAN WORD SEPARATOR
	x (full stop - 002E)
	x (middle dot - 00B7)
	x (word separator middle dot - 2E31)
	* sometimes shown with a glyph for a short vertical bar

Should the annotation be moved up, directly below the name?
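
As a rough illustration of the ordering in question, the following Python sketch sorts the annotation lines of one names-list entry so that aliases (=) come first, then notes (*), then cross references (x), then decompositions (#). That order is only inferred from other entries quoted on this page (for example 1DA6). The function name is hypothetical, and nothing here describes how Unibook itself works.

```python
# Assumed ordering of annotation markers, inferred from entries such as
# 1DA6 quoted on this page; it is not taken from Unibook documentation.
ANNOTATION_ORDER = {"=": 0, "*": 1, "x": 2, "#": 3}


def reorder_entry(lines):
    """Return the entry with its annotation lines sorted by marker.

    The first line (code point and name) stays in place. sorted() is
    stable, so lines sharing a marker keep their relative order.
    """
    header, annotations = lines[0], lines[1:]

    def rank(line):
        marker = line.strip()[:1]
        # Unknown markers sort after all known ones.
        return ANNOTATION_ORDER.get(marker, len(ANNOTATION_ORDER))

    return [header] + sorted(annotations, key=rank)


entry = [
    "1091F\tPHOENICIAN WORD SEPARATOR",
    "\tx (full stop - 002E)",
    "\tx (middle dot - 00B7)",
    "\tx (word separator middle dot - 2E31)",
    "\t* sometimes shown with a glyph for a short vertical bar",
]
reordered = reorder_entry(entry)
```

With this input, the note about the vertical-bar glyph moves to the line directly below the name, and the three cross references follow in their original order.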

Date/Time: Sun Apr 30 10:09:34 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Annotations about African usage of open o and e

Shouldnʼt open o and epsilon (open e) get annotations referring to African usage?
Sample result:

0254	LETTRE MINUSCULE LATINE O OUVERT
	* voyelle mi-basse postérieure arrondie
	* langues d’Afrique
	* en vieux danois, « 0254: » signifie « c’est-à-dire »
	* typographiquement un c réfléchi
	* la majuscule est 0186

025B	LETTRE MINUSCULE LATINE EPSILON
	= e ouvert
	* voyelle mi-basse antérieure étirée
	* langues d’Afrique
	* la majuscule est 0190
	x (lettre minuscule grecque epsilon - 03B5)

Date/Time: Sun Apr 30 10:12:12 CDT 2017
Name: Michael Everson
Report Type: Public Review Issue
Opt Subject: Khitan Small Script

Two elements are relevant here. The first is that the proposed formatting
characters at 18CFE and 18CFF do not have the consensus of the user community,
so they should be removed for further study.

The second has to do with the radicals. Khitan Small Script has 20 radicals, 8
of which have the same glyph appearance as characters used in Khitan words,
but 12 of which are used only as radicals. The Jurchen script has 50 radicals,
15 of which are identical to Khitan Small Script radicals, and only 1 of which
has the same glyph appearance as any of the 800-900 Jurchen characters. The
full repertoire of Khitan Large Script radicals is as yet uncertain, but at
least 1 of these is unique to Khitan Large Script, and 1 is identical to a
Jurchen radical.

It does not make sense to duplicate or triplicate radicals for Khitan Small
Script, Jurchen, and Khitan Large Script. All that is required by modern users
is a single "Ideographic Radicals" block, where the radicals used for these
three closely-related scripts can be properly unified, with the property
script=Common.

Date/Time: Sun Apr 30 10:15:45 CDT 2017
Name: Michael Everson
Report Type: Public Review Issue
Opt Subject: Feedback on Unicode 10.0 beta: 1F3B1 BILLIARDS

The beta includes a proposed glyph change for 1F3B1 BILLIARDS. The source for
this character in both SoftBank and KDDI is, properly ビリヤード, biriyādo, that is
‘billiards’. The reference glyphs for both of those is a green game table with
coloured balls, with a cue in the SoftBank glyph. It has always been supposed
that this character could be used for pocketed or pocketless billiards,
snooker, pool, carom, and other cue games. No indication of the “eight ball”
is given in the Japanese sources or the original reference glyph, and indeed,
numbered balls are not used in many cue sports, like snooker and carom.
Apparently, some (but not all) vendors have represented this character with an
eight-ball, but it appears that this is not because this is the most
representative glyph for “billiards” but because of the “Magic 8 Ball” oracle
toy, popular especially in the United States, where it seems to have been
introduced in 1950, though a “magic ball” was used in a 1940 Three Stooges
short called “You Nazty Spy”.

The original glyph for BILLIARDS should be retained unchanged, and if
necessary a new character, EIGHT BALL, can be encoded at a suitable place in
the Supplemental Symbols and Pictographs block. Vendor replacement of the
original reference glyph constitutes a distortion of the intended meaning of
the character, and is therefore an error. Any precedent agreeing that vendors
may alter the meaning of symbols, would be a very bad precedent, and this
should not be encouraged.

Date/Time: Sun Apr 30 10:17:26 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: Annotations at block start for IPA

In French there are more annotations at block start for IPA, that have now
grown to the below. Would this be interesting in the Code Charts too?

For your convenience, a revised French version is currently available as an
HTML page at:
http://dispoclavier.com/caras/index.html

@@	0250	Alphabet phonétique international	02AF
@+		Pour diversifier son alphabet, l’API utilise d’une part la diacritation, 
en particulier : « barré » = avec trait oblique ; « rayé » = avec trait horizontal ; et 
d’autre part les transformations du plan, dont les principaux attributs sont : « inversé » = 
symétrie par rapport à l’axe horizontal ; « réfléchi » = symétrie par rapport à l’axe vertical ; 
« tourné » = rotation de 180° ; « couché » = rotation de 90° : en sens horaire = « couché à 
droite » ; en sens antihoraire = « couché à gauche » (dans la plupart des cas).
;0
@		Compléments pour l’API
@+		Plusieurs lettres de l’API sont entrées dans l’orthographe de nombreuses langues, 
dont quelques-unes sont citées en exemples dans les commentaires. L’API inclut les lettres latines 
de base, ainsi que des lettres latines et des lettres grecques codées dans les blocs Latin-1 
complémentaire (0080..00FF), Latin étendu A (0100..017F), ou Grec et copte (0370..03FF). §07.1
		x (minuscule latine e dans l’a - 00E6)
		[…]

Best regards,

Marcel

Date/Time: Sun Apr 30 10:26:30 CDT 2017
Name: Marcel Schneider
Report Type: Error Report
Opt Subject: U+0258 and U+018E, U+01DD

Is there a need to cross-reference this:
0258	LATIN SMALL LETTER REVERSED E
	* upper-mid central unrounded vowel
at this:
018E	LATIN CAPITAL LETTER REVERSED E
	= turned e
	* Pan-Nigerian alphabet
	* lowercase is 01DD
and:
01DD	LATIN SMALL LETTER TURNED E
	* Pan-Nigerian alphabet
	* all other usages of schwa are 0259
	* uppercase is 018E
	x (latin small letter schwa - 0259)

U+0258 has no uppercase, but the whole could get confusing. In French Iʼve put it this way:

018E	LETTRE MAJUSCULE LATINE E TOURNÉ
	= e réfléchi
	* alphabet pannigérian
	* la minuscule est 01DD
	* typographiquement l’œil est tourné, non réfléchi
	x (lettre minuscule latine e tourné - 01DD)
	x (lettre minuscule latine e réfléchi - 0258)
018F	LETTRE MAJUSCULE LATINE SCHWA
	= lettre majuscule latine chva
	* azéri, …
	* la minuscule est 0259
	* parfois pris pour une variante, 018E LETTRE MAJUSCULE LATINE E TOURNÉ est la 
majuscule de 01DD LETTRE MINUSCULE LATINE E TOURNÉ
	x (lettre majuscule cyrillique schwa - 04D8)
[…]
01DD	MINUSCULE LATINE E TOURNÉ
	* alphabet pannigérian
	* cette minuscule a même œil que 0259 MINUSCULE LATINE SCHWA, mais elle doit être 
distinguée parce que leurs majuscules sont différentes
	* la majuscule est 018E
	x (minuscule latine schwa - 0259)
	x (minuscule latine e réfléchi - 0258)

Date/Time: Sun Apr 30 10:53:30 CDT 2017
Name: Michael Everson
Report Type: Public Review Issue
Opt Subject: Feedback on Unicode 10.0 beta: 1F996 T-REX

The proposed characters 1F995 SAUROPOD and 1F996 T-REX are not satisfactory.
In the first place, the name SAUROPOD refers to a clade of the suborder of
Sauropodomorpha of the order Saurischia, which includes Apatosaurus,
Brachiosaurus, Brontosaurus, Diplodocus, and many other species. Such an
umbrella is perfectly reasonable, and Unicode has done that even for OCTOPUS
(referring to some 300 species). “T-REX” on the other hand refers to a
particular species, which is too precise. Moreover, as an abbreviation, the
hyphen is never used in scientific names (the correct form would be T. rex)
and in the context of a formal Unicode name TYRANNOSAURUS REX would be the
correct term (however it might be presented to an end user in a format like
:t-rex: or whatever). A slang abbreviation is not a suitable name for formal
standardization. It is, of course, suitable for an informative note.

Beyond this, however, is the fact that a group like Sauropod and an individual
species like Tyrannosaurus rex do not form anything like a coherent group that
epitomizes “dinosaur”. Many millions of people admire dinosaurs, and it’s
quite common to find that people have had one or more favourite dinosaurs from
childhood. “Where’s my Triceratops?” “Why isn’t there an Iguanodon?” will
surely be some of the first reactions to the standardization of only two
pictographs in this block. On the other hand, a properly complete set will
certainly be very popular indeed.

Evidently some very basic proposals had been made to encode some “dinosaurs”
as emoji, but singling out two simply makes no sense. The Unicode Standard
includes many mammal symbols, and work seems to be ongoing to identify a
larger and larger set of them, based on evident familiarity, metaphor, and
desirability indicating some expected use. It appears that many successful
emoji proposals, at least in part, have not been based on systematic analysis
or even on internet discussions about missing emoji, but rather on Instagram
and Google Trends data based on word frequency. This does not seem to be an
entirely sufficient criterion, particularly as emoji are often used
metaphorically, and outside of metaphor words may be used for all sorts of
ordinary reasons. It may be useful to note that the word “cricket” is probably
far, far more commonly used on the internet for the sport than for the insect.
Do many people use the word “Sauropod” in speech? Quite likely they do not,
but the class of “dinosaurs” is comprised of a number of familiar groups, and
a relatively small number of encoded pictographs would suffice to represent
that group.

The set of existing Unicode emoji symbols for the kingdom Animalia is not very
well balanced. This is nobody’s fault. The set began with animals implemented
in late-90s Japanese telecom sets. This was augmented by German and Irish
National Body comments to ISO/IEC 10646 ballots adding more animals for, for
example, a complete set of characters used in the Asian Zodiac. (That is why
there is a crocodile encoded, for instance; this represents the Dragon in
Kazakhstan.) Since then some more animals have been added. The current Unicode
Standard (including the content of Unicode 10.0 beta) has, in the emoji
classification:

1 amphibian
13 birds
11 “bugs” (9 arthropods, 1 mollusc, and 1 architectural device made out of a proteinaceous extrusion)
48 mammals (47 mammals and 1 pair of paw prints)
12 “marine” creatures (including 4 fish, 3 mammals, 2 arthropods, 2 molluscs, and a mollusc shell)
6 reptiles (including 2 dragons).

Where the two dinosaur characters would be classified is uncertain. Perhaps
the sauropod would be classed as a reptile, and the tyrannosaur as a bird
(both go back to a subgroup Tyrannoraptora).

Levity aside, there are, according to Mammal Species of the World, 5,416
species of mammals identified in 2006. These were grouped into 1,229 genera,
153 families and 29 orders. While it is likely that more mammal pictographs
could be added to Unicode, 45 isn’t a bad start. It’s unlikely that symbols
for 1200 genera would be needed. For dinosaurs, however, the number of genera
is much smaller (about 500) and there too, it is unlikely that a great many
symbols would be needed. But given their popularity, it seems that certainly
more than two is necessary.

Described as dragons in the Western Jin Dynasty, dinosaurs have fascinated our
culture for a very long time. Modern study of dinosaurs has done so no less,
and popular culture is permeated by them: noteworthy are Jules Verne’s 1864
Journey to the Centre of the Earth (Ichthyosaurs, Plesiosaurs; Dimetrodon was
in the 1959 film of this book); Arthur Conan Doyle’s 1912 The Lost World
(Ichthyosaurs, Iguanodon, Plesiosaurs, Pterosaurs, Sauropods, Stegosaurians,
some carnivorous Theropods); the 1933 film King Kong (Ceratopsians,
Plesiosaurs, Pterosaurs, Sauropods, Stegosaurians, Tyrannosaurids); many
others, until more modern scientific findings about dinosaurs found their way
into Michael Crichton’s 1990 novel Jurassic Park and the films that were based
on it. Some non-dinosaur characters also have high visibility in popular
culture. Two of these, the MAMMOTH and DODO, are commonly used in ordinary
phrases: “a mammoth sale”, “as dead as a dodo”.

The character names given below are chosen from the standard scientific
taxonomy, and so the most identifiable species in each class of dinosaur are
reflected with accurate nomenclature. Thus there are some genera, families,
superfamilies, suborders, orders, and clades represented.

Because the characters proposed below represent the most iconic and popularly
identifiable dinosaurs, the UTC would not expect further requests to encode
additional ones. It seems that a set of 18 symbols representing dinosaurs and
some other prehistoric creatures would do well to fill in the gaps implied by
SAUROPOD and T-REX. Encoding only those 2 characters at this time would simply
lead to calls to fill the gaps, but that can be easily done, as shown here. A
species-based nomenclature would be possible but less advantageous. If
TRICERATOPS were encoded, the glyph should really not be of a Protoceratops or
Styracosaurus. CERATOPSIAN gives glyph designers more choice.

Dinosaurs and other prehistoric reptiles

1F9A0 ANKYLOSAUR
• a suborder of the order Ornithischia
1F9A1 ARCHAEOPTERYX
• a genus of the suborder Theropoda of the order Saurischia
1F9A2 CERATOPSIAN
• a suborder of the order Ornithischia
• includes Protoceratops, Styracosaurus, Triceratops
1F9A3 DROMAEOSAURID
• a family of the suborder Theropoda of the order Saurischia
• includes Deinonychus, Utahraptor, Velociraptor
1F9A4 HADROSAURID
• a family of the suborder Ornithopoda of the order Ornithischia
• includes Edmontosaurus, Hadrosaurus, Parasaurolophus
1F9A5 ICHTHYOSAUR
• a member of the order Ichthyosauria
1F9A6 IGUANODON
• a genus of the suborder Ornithopoda of the order Ornithischia
1F9A7 ORNITHOMIMID
• a family of the suborder Theropoda of the order Saurischia
• includes Gallimimus, Ornithomimus, Struthiomimus
1F9A8 PACHYCEPHALOSAUR
• a family of the suborder Pachycephalosauria of the order Ornithischia
• includes Pachycephalosaurus, Stegoceras
1F9A9 PLESIOSAUR
• a member of the order Plesiosauria
• includes Elasmosaurus, Liopleurodon, Plesiosaurus
• represents Nessie, the Loch Ness Monster
1F9AA PTEROSAUR
• a member of the order Pterosauria
• includes Pteranodon, Pterodactylus, Quetzalcoatlus
1F9AB SAUROPOD
• a clade of the suborder of Sauropodomorpha of the order Saurischia
• includes Apatosaurus, Brachiosaurus, Brontosaurus, Diplodocus
1F9AC SPHENACODONTID
• a member of the family Sphenacodontidae
• includes Ctenospondylus, Dimetrodon, Secodontosaurus, Sphenacodon
1F9AD STEGOSAURIAN
• a suborder of the order Ornithischia
• includes Huayangosaurus, Kentrosaurus, Stegosaurus
1F9AE TYRANNOSAURID
• a superfamily of the suborder of Theropoda of the order Saurischia
• includes Albertosaurus, Gorgosaurus, Tyrannosaurus (T. rex)

Extinct creatures
1F9AF SABRE-TOOTHED CAT
= Smilodon
1F9B0 MAMMOTH
• indicates great size
1F9B1 DODO
• indicates extinction

Date/Time: Sun Apr 30 10:59:46 CDT 2017
Name: Michael Everson
Report Type: Public Review Issue
Opt Subject: Feedback on Unicode 10.0 beta: Multiple-person emojis

1F46F WOMAN WITH BUNNY EARS, 1F486 FACE MASSAGE, 1F931 BREAST-FEEDING, 1F93C
WRESTLERS all show multiple people in them (despite the name of the first) and
the Fitzpatrick modifiers cannot be applied to them in the way that they are
usually used. It is certainly the case that 1F931 cannot be rendered sensibly
without two people in it, but of course it is the case that the appearance
of the mother and child is not always identical. A solution should be found
for this.

Date/Time: Sun Apr 30 11:10:21 CDT 2017
Name: Michael Everson
Report Type: Public Review Issue
Opt Subject: Feedback on Unicode 10.0 beta: 1F9DD ELF

Elves have a variety of shapes depending on mythology. They may be tall (as in
Middle-earth), and they can be short (as in the North Pole). And there are
other variations. This character is likely to be used alongside 1F596 anyway
(as in the United Federation of Planets), and so a better name for it would be
PERSON WITH POINTED EARS.

Date/Time: Sun Apr 30 17:32:58 CDT 2017
Name: Richard Wordingham
Report Type: Public Review Issue
Opt Subject: Unicode 10.0.0 Beta Review

My Beta Review comments for Unicode 8.0.0, at
http://www.unicode.org/review/pri297/feedback.html, still apply to proposed
property values for Unicode 10.0.0.  I submitted a detailed document for most
of the suggested Tai Tham changes earlier today, 30 April 2017 (BST).

Date/Time: Mon May 1 09:24:09 CDT 2017
Name: Marcel Schneider
Report Type: Other Question, Problem, or Feedback
Opt Subject: Descriptor Property

Hello,

Yesterday I sadly forgot one feedback item that having come last wasnʼt added
to my draft. It is for consideration by the UTC, so if you are just about to
consolidate the items, would it be possible to append this one at the end?

I propose adding a new character property, just as another one is being added
in Unicode 10.0. This new property would help a great deal with the many
characters that were given suboptimal names. The Descriptor property would
make it possible to implement in the UCS what otherwise only translations can
achieve: a useful English designation for all end users who have learned
English, that is, almost everybody dealing with IT in one way or another.

The addition of the Descriptor property is one of three measures to be taken
today (the other two being about typography and keyboard layouts, thus out of
the scope of Unicode). The reason is that, over time, a handful of people here
and another handful there enforced names that have become useless since
Unicode introduced bidi-mirroring, or that have been designed inconsistently
to restrict the use of some characters, or that are otherwise misleading and
of no use but had been handled privately by some people who pushed them into
some standard that was subsequently to be referred to by Unicode or ISO/IEC
JTC1 SC2 WG2 but inconsistently, i.e. without feedback from concerned ISO
member countries, or with altered feedback due to interference of some IT
company for purely financial considerations against the interests of their
native country.

Should that not already be clear enough, here is the formula Iʼm currently
using: Billions of innocent newcomers are bothered with junk names, junk
typography and junk keyboard layouts only because, over time, a handful of
folks were too selfish.

Therefore, completing the panel of features that make Unicode a great
Standard, by adding the Descriptor property and giving it a good value for
every character while not subjugating it to stability policies, the UTC will
be enabled to fully endorse cultural responsibilities.

Best regards,

Marcel Schneider

Date/Time: Mon May 1 14:35:01 CDT 2017
Name: David Corbett
Report Type: Error Report
Opt Subject: Tifinagh contextual shaping

Section 19.3 subsection “Contextual Shaping” discusses how the Tifinagh
letters yal and yan are lowered or slanted in ambiguous contexts. However,
according to L2/04-142, this does not apply in Neo-Tifinagh where these
letters are never ambiguous (except, according to L2/10-278, in Algeria). The
wording in The Unicode Standard implies that this contextual shaping is de
rigueur for Tifinagh fonts, whereas it seems that general-purpose fonts should
use the Neo-Tifinagh conventions (as Noto Sans Tifinagh, for example, does).
The subsection should be reworded to explain that this contextual shaping only
applies to traditional variants of the script.

Date/Time: Tue May 2 13:24:57 CDT 2017
Name: Lorna Evans
Report Type: Error Report
Opt Subject: Combining class for U+08D9

I just realized the combining class for U+08D9 is wrong.
08D9;ARABIC SMALL LOW NOON WITH KASRA;Mn;230;NSM;;;;;N;;;;;

It should be 220 and not 230
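
For readers less familiar with the record format, here is a small Python sketch that splits the UnicodeData.txt line quoted above into its semicolon-separated fields and reads the canonical combining class from the fourth field (index 3). The helper names are hypothetical. The second helper only shows what the record would look like with the value suggested in this report; it does not say anything about what the published data contains.

```python
# The record exactly as quoted in the report above.
record = "08D9;ARABIC SMALL LOW NOON WITH KASRA;Mn;230;NSM;;;;;N;;;;;"


def parse_record(line):
    """Pick out the first few fields of one UnicodeData.txt record."""
    fields = line.split(";")
    return {
        "code_point": int(fields[0], 16),
        "name": fields[1],
        "general_category": fields[2],
        "ccc": int(fields[3]),  # Canonical_Combining_Class
        "bidi_class": fields[4],
    }


def with_ccc(line, new_ccc):
    """Return the same record with a different combining class (field 3)."""
    fields = line.split(";")
    fields[3] = str(new_ccc)
    return ";".join(fields)


parsed = parse_record(record)
proposed = with_ccc(record, 220)  # 220 = Below, 230 = Above
```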

Lorna

Date/Time: Wed May 3 10:32:38 CDT 2017
Name: David Corbett
Report Type: Error Report
Opt Subject: Table 19-3. N’Ko Diacritic Usage

Table 19-3 lists the phonetic values of various combinations of consonants and
diacritics. The phonetic notation is almost the IPA; I suggest using the IPA
properly. ⟨ḫ⟩ and ⟨ḥ⟩ are not IPA symbols. If ⟨yʰ⟩ is meant to be a consonant,
the ⟨y⟩ should be ⟨j⟩. The normal ⟨g⟩ should be the script ⟨ɡ⟩. I don’t know
what ⟨ʰ⟩ represents in this table, but the IPA symbol ⟨ʰ⟩ denotes aspiration
which would be unusual for these voiced consonants. It might represent the
digraphs in the pre-1989 Guinean alphabet, in which case ⟨bʰ⟩ should be ⟨ɓ⟩
and ⟨yʰ⟩ should be ⟨ʔʲ⟩.

But you’ll need someone who knows about N’Ko to tell you the intended phones;
all I know is the table doesn’t follow a standard.

Date/Time: Wed May 3 11:36:43 CDT 2017
Name: David Corbett
Report Type: Error Report
Opt Subject: Vai line breaking

The standard says that “Line breaking opportunities can occur between most 
characters except that line breaks should not occur before U+A60B VAI SYLLABLE NG 
used as a syllable final, or before U+A60C VAI SYLLABLE LENGTHENER (which is 
always a syllable final)”, but all Vai letters have line break class AL, so 
Vai line breaking does not work.
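
To make the quoted rule concrete, the following Python sketch lists the positions in a string where a break would be allowed if the rule were applied literally. It is a simplification under stated assumptions: every U+A60B is treated as syllable-final (the text says the restriction applies only in that use), the sample string uses U+A500 and U+A501 as arbitrary Vai syllables with no linguistic meaning, and the function is hypothetical rather than an implementation of the Unicode line breaking algorithm.

```python
VAI_SYLLABLE_NG = "\uA60B"
VAI_SYLLABLE_LENGTHENER = "\uA60C"

# Assumption: every U+A60B is syllable-final. The quoted text restricts
# the rule to that use, which this sketch cannot detect.
NO_BREAK_BEFORE = {VAI_SYLLABLE_NG, VAI_SYLLABLE_LENGTHENER}


def break_opportunities(text):
    """Return indices i such that a break is allowed before text[i]."""
    return [i for i in range(1, len(text)) if text[i] not in NO_BREAK_BEFORE]


# Arbitrary Vai syllables with a final NG and a lengthener in between.
sample = "\uA500\uA60B\uA501\uA60C\uA500"
allowed = break_opportunities(sample)
```

For this sample the allowed positions are 2 and 4, that is, after the NG and after the lengthener but never before either of them.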

Date/Time: Wed May 3 11:52:01 CDT 2017
Name: David Corbett
Report Type: Error Report
Opt Subject: Table 19-6. Number Formation in Mende Kikakui

The row in table 19-6 explaining the number 206 calls the first character “TWO”. 
It should read “U+1E8C8 MENDE KIKAKUI DIGIT TWO” to be consistent with the rest 
of the table.

Date/Time: Wed May 3 12:32:30 CDT 2017
Name: David Corbett
Report Type: Error Report
Opt Subject: Typo in Khojki section


“diagraphs” should be “digraphs”.

Date/Time: Wed May 3 13:36:43 CDT 2017
Name: David Corbett
Report Type: Error Report
Opt Subject: Typo in Modi section

“Calutta” should be “Calcutta”.

Date/Time: Wed May 3 15:04:29 CDT 2017
Name: David Corbett
Report Type: Error Report
Opt Subject: Problems in Tibetan section

> > When the consonant “ra” is written in the “head” position (ra-mgo, 
> > pronounced ra-go) at the top of a stack in the normal Tibetan-defined lettering 
> > set, the shape of the consonant can change. This is called ra-go (ra-mgo).

It is not necessary to explain that it is called ra-go twice in a row.

“wazur” and “ngaro” should each have a hyphen to represent the tsek.

When “a-chung” is in roman type, representing the transliteration instead of the 
pronunciation, it should begin with an apostrophe, i.e. “’a-chung”.

“go-yig (yig-mgo)”: in both places this string appears, the pronunciation doesn’t 
match the transliteration. In the second place, the pronunciation should be italicized.

Date/Time: Thu May 4 17:13:50 CDT 2017
Name: David Corbett
Report Type: Error Report
Opt Subject: Batak orthographic syllable with inherent vowel

Section 17.6 defines the Batak orthographic syllable as C(V(C_s|C_d)). Should that 
be C(V)(C_s|C_d)? The current definition says that ᯇᯧᯰ (peng) is an orthographic 
syllable, but ᯇᯰ (pang) is not.
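
The difference between the two formulas can be checked with a pair of regular expressions. This Python sketch assumes that parentheses in the formula mean "optional", and it uses the placeholder symbols C (consonant), V (vowel sign) and F (a final consonant sign, C_s or C_d) instead of real Batak characters. The pattern and function names are hypothetical.

```python
import re

# C = consonant, V = vowel sign, F = final consonant sign (C_s or C_d).
# Assumption: parentheses in the formula mark optional elements.
CURRENT_DEFINITION = re.compile(r"C(?:VF?)?")  # C(V(C_s|C_d))
PROPOSED_DEFINITION = re.compile(r"CV?F?")     # C(V)(C_s|C_d)


def is_syllable(pattern, symbols):
    """True if the whole symbol string matches the syllable pattern."""
    return pattern.fullmatch(symbols) is not None


peng = "CVF"  # consonant + vowel sign + final
pang = "CF"   # consonant with inherent vowel + final
```

Under the current definition `peng` matches and `pang` does not. Under the proposed definition both match.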

Date/Time: Thu May 4 20:56:46 CDT 2017
Name: David Corbett
Report Type: Error Report
Opt Subject: Vagueness in Thai section

> > Use of combining diacritics with the Thai script, such as U+0331 COMBINING
> > MACRON BELOW and U+0303 COMBINING TILDE, imposes additional constraints
> > for rendering systems for Thai. This is because the canonical ordering of
> > these marks with respect to Thai vowels and tone marks may put them in
> > orders which require rearranging during rendering.

It “may” require rearranging? How can I know for sure? This is just enough
information to make the implementer of a rendering system worried; it doesn’t
actually say what to do. It should explain how these common-script diacritics
visually interact with Thai diacritics in Patani Malay. The answer given in
L2/10-451 is that U+0331 appears between a consonant and a vowel below, and
that U+0303 never appears with diacritics above so it doesn’t matter what the
renderer does. Contrariwise,
http://thep.blogspot.com/2015/08/patani-malay-support-in-fonts-tlwg-with.html
says that U+0303 can appear with tone marks and appears between a consonant and
a tone mark.
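
The reordering that the quoted passage alludes to can be observed with Python's standard `unicodedata` module. This is only a sketch: the base letter U+0E01 and the particular vowel sign and tone mark are arbitrary choices for illustration, and the code shows canonical ordering only, not what a renderer ought to do with the result.

```python
import unicodedata

BASE = "\u0e01"          # THAI CHARACTER KO KAI (arbitrary base, an assumption)
SARA_U = "\u0e38"        # THAI CHARACTER SARA U, a vowel sign below
MAI_EK = "\u0e48"        # THAI CHARACTER MAI EK, a tone mark above
MACRON_BELOW = "\u0331"  # COMBINING MACRON BELOW
TILDE = "\u0303"         # COMBINING TILDE


def ccc(ch):
    """Canonical_Combining_Class of a single character."""
    return unicodedata.combining(ch)


def canonical_order(text):
    """NFD applies the canonical ordering algorithm to combining marks."""
    return unicodedata.normalize("NFD", text)


def show(text):
    """Render a string as a sequence of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Logical order as described: generic diacritic directly after the consonant.
typed_below = BASE + MACRON_BELOW + SARA_U
typed_above = BASE + TILDE + MAI_EK

for typed in (typed_below, typed_above):
    print(show(typed), "->", show(canonical_order(typed)))
```

Because the Thai marks have lower combining classes than U+0331 and U+0303, normalization moves the Thai mark in front of the generic diacritic, so a renderer that wants the diacritic closest to the consonant has to undo that order.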

Date/Time: Thu May 4 21:36:34 CDT 2017
Name: David Corbett
Report Type: Error Report
Opt Subject: Minor problems in Khmer section

Not all the clusters in the “Qa with Vowel Sign” column of Table 16-5 include the letter qa.

The IPA throughout the Khmer section uses ⟨y⟩ where it should use ⟨j⟩.

Date/Time: Thu May 4 22:50:05 CDT 2017
Name: David Corbett
Report Type: Public Review Issue
Opt Subject: PRI #341: Pau Cin Hau final tone marks

According to the proposal document for Pau Cin Hau, the tone marks with “FINAL” in 
their names mark the ends of sentences. They are simultaneously letters and punctuation. 
For example, ⟨𑫆𑫙𑫣𑫭 𑫊𑫕𑫤⟩ has a sentence break in it. Therefore, these 10 characters 
should be given the STerm property.

Date/Time: Fri May 5 13:46:46 CDT 2017
Name: David Corbett
Report Type: Error Report
Opt Subject: PRI #341: More sentence-terminal punctuation

According to The Unicode Standard, U+061E, U+0837, U+0839, U+083D, U+083E, U+10B3C, 
U+10B3E, and U+11C71 mark the ends of sentences or larger units. They should therefore 
have Sentence_Terminal=Yes.
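
As a rough illustration of what the property change would affect, here is a deliberately simplified sketch of a splitter that breaks after any character in a given terminal set. It is not the UAX #29 sentence-break algorithm; the function name and the sample text are hypothetical, and only the code point list comes from the report above.

```python
# Code points named in the report above.
PROPOSED_STERM = frozenset(
    chr(cp)
    for cp in (0x061E, 0x0837, 0x0839, 0x083D, 0x083E,
               0x10B3C, 0x10B3E, 0x11C71)
)


def split_after_terminals(text, terminals):
    """Naive splitter: end a segment after every terminal character.

    A stand-in for real sentence segmentation, which has many more rules.
    """
    segments, start = [], 0
    for i, ch in enumerate(text):
        if ch in terminals:
            segments.append(text[start:i + 1])
            start = i + 1
    if start < len(text):
        segments.append(text[start:])
    return segments


sample = "abc\u061Edef"  # hypothetical text containing U+061E
print(split_after_terminals(sample, frozenset()))     # no terminals known
print(split_after_terminals(sample, PROPOSED_STERM))  # with the proposed set
```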

Date/Time: Fri May 5 14:12:03 CDT 2017
Name: David Corbett
Report Type: Error Report
Opt Subject: Typo in Arabic section

“associatrd” should be “associated”.

Date/Time: Sun May 7 07:03:39 CDT 2017
Name: Shriramana Sharma
Report Type: Submission (FAQ, Tech Note, Case Study)
Opt Subject: Addition to ScriptExtensions

In ScriptExtensions.txt of Unicode 9.0, I note:

11303         ; Gran Taml # Mc       GRANTHA SIGN VISARGA
1133C         ; Gran Taml # Mn       GRANTHA SIGN NUKTA

I submit that 11301 GRANTHA SIGN CANDRABINDU should be added to the above.
See L2/10-407, p. 4, the picture on the right side, three entries above the last.
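
For reference, the requested entry can be written in the same layout as the two lines quoted above. The sketch below builds such lines with Python's `unicodedata` module; the helper name and the column widths are assumptions, and it needs a Python build whose Unicode database includes the Grantha block.

```python
import unicodedata


def script_extensions_line(code_point, scripts):
    """Format one entry in the style of the ScriptExtensions.txt lines above.

    The spacing imitates the quoted lines and is not an authoritative layout.
    """
    ch = chr(code_point)
    field = f"{code_point:04X}".ljust(14)
    category = unicodedata.category(ch)
    name = unicodedata.name(ch)
    return f"{field}; {' '.join(scripts)} # {category}       {name}"


existing = script_extensions_line(0x11303, ["Gran", "Taml"])
proposed = script_extensions_line(0x11301, ["Gran", "Taml"])
print(existing)
print(proposed)
```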