The sections below contain links to permanent feedback documents for the open Public Review Issues as well as other public feedback as of October 31, 2015, since the previous cumulative document was issued prior to UTC #145 (October 2015). Grayed-out items in the Table of Contents do not have feedback here.
The links below go directly to open PRIs and to feedback documents for them, as of January 22, 2016. Gray rows have no feedback to date.
| Issue | Name | Feedback Link |
|---|---|---|
| 316 | Proposal to Remove Some Hira/Kata From Script_Extensions | (feedback) |
| 315 | Proposed Update UAX #9, Unicode Bidirectional Algorithm | (feedback) |
| 314 | Proposed Update UAX #45, U-Source Ideographs | (feedback) |
| 313 | Proposed Update UTS #39, Unicode Security Mechanisms | (feedback) |
| 312 | Feedback on Draft additional repertoire for ISO/IEC 10646:2016 (5th edition) CD2 | (feedback) |
| 311 | Proposed Update UTS #10, Unicode Collation Algorithm | (feedback) |
| 310 | New Character Property for Prepended Concatenation Marks | (feedback) |
| 308 | Property Change for U+202F NARROW NO-BREAK SPACE (NNBSP) | (feedback) no new |
| 307 | Proposed Update UAX #38, Unicode Han Database (Unihan) | (feedback) |
| 306 | Proposed Update UAX #29, Unicode Text Segmentation | (feedback) no new |
| 305 | Proposed Update UAX #44, Unicode Character Database | (feedback) |
| 304 | Proposed Update UAX #24, Unicode Script Property | (feedback) |
| 303 | Proposed Update UAX #31, Unicode Identifier and Pattern Syntax | (feedback) no new |
The links below go to locations in this document for feedback.
Feedback to UTC / Encoding Proposals
Feedback on UTRs / UAXes
Date/Time: Mon Dec 14 18:03:43 CST 2015
Name: Raph Levien
Report Type: Error Report
Opt Subject: Not all emoji ZWJ sequences supported on OSX 10.11
The emoji sequences that include U+2764 but no U+FE0F variation selector do not render correctly in Mac OS X 10.11 (El Capitan).

Repro steps: download http://www.unicode.org/Public/emoji/2.0//emoji-zwj-sequences.txt, open in TextView. The top set all render correctly (a single compound emoji representing the sequence). The bottom set (which all have U+2764 but not a following U+FE0F) split into individual emoji.

I recommend that the emoji-zwj-sequences.txt data file indicate that the bottom set is not reliably rendered on all platforms, even those that in general aggressively implement Unicode 8 emoji and ZWJ sequences.
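The distinction drawn in the report above (U+2764 with or without a following U+FE0F) can be checked mechanically. The following is a minimal sketch, not part of the report. It assumes each data line holds space-separated hexadecimal code points, optionally followed by `;` fields and a `#` comment; that layout, the function names, and the two sample lines are illustrative assumptions rather than a verified excerpt of emoji-zwj-sequences.txt.

```python
HEART = 0x2764  # HEAVY BLACK HEART
VS16 = 0xFE0F   # VARIATION SELECTOR-16


def parse_sequence(line):
    """Return the code points on one data line, or [] for blank/comment lines.

    Assumed layout (hypothetical): space-separated hex code points,
    optionally followed by ';' fields and a '#' comment.
    """
    field = line.split("#", 1)[0].split(";", 1)[0].strip()
    return [int(cp, 16) for cp in field.split()]


def heart_lacks_vs16(seq):
    """True if any U+2764 in seq is not immediately followed by U+FE0F."""
    return any(
        cp == HEART and (i + 1 == len(seq) or seq[i + 1] != VS16)
        for i, cp in enumerate(seq)
    )


def sequences_at_risk(lines):
    """Collect the sequences that contain U+2764 without a following U+FE0F."""
    result = []
    for line in lines:
        seq = parse_sequence(line)
        if seq and heart_lacks_vs16(seq):
            result.append(seq)
    return result


# Illustrative lines only; not copied from the actual data file.
SAMPLE = [
    "# hypothetical excerpt",
    "1F468 200D 2764 FE0F 200D 1F468",
    "1F468 200D 2764 200D 1F468",
]

for seq in sequences_at_risk(SAMPLE):
    print(" ".join(f"{cp:04X}" for cp in seq))
```

Run against the real file, a check of this kind would list the candidates for the "not reliably rendered" annotation the report recommends.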
Date/Time: Thu Jan 21 12:15:00 CST 2016
Name: Doug Ewell
Report Type: Feedback on an Encoding Proposal
Opt Subject: Feedback on L2/16-008, "Unicode-Specified Emoji Customizations"
With regard to the choice of U+E007E TAG TILDE as a terminator of emoji tag sequences, L2/16-008 states, "NOTE: if we un-deprecated U+E007F CANCEL TAG in Unicode v9.0, we could use that for the terminator, which would be slightly more natural." The current working draft on Google Docs strengthens this to "... which would be a more natural choice."

U+E007F CANCEL TAG was originally intended to mark the end of a language-tagged block of text. As such, the usage suggested in L2/16-008 to mark the end of a tag sequence is very similar, although not identical. The character is already encoded and un-deprecating it would be a comparatively inexpensive operation for UTC, and like the earlier un-deprecation of U+E0020 through U+E007E, it would not imply any manner of support for the older language-tagging concept.

The current choice of TAG TILDE is arbitrary and could potentially be a source of confusion, given that the CLDR validity files for region and subdivision (intended to be used in validating flag tag sequences) use an ASCII tilde for a completely different purpose, to indicate ranges.

I support removing the deprecated status of U+E007F CANCEL TAG and assigning it as the terminator character described in L2/16-008.

As a side note, there are discrepancies in the terminology used in L2/16-008 to define tag sequences. The chart of "special terms" includes 'tag-term' and 'tag-nterm', but in the following ABNF and subsequent examples, these are changed to 'Tag-STOP' and 'tag-nt' respectively. Disregarding the differences in capitalization, the actual labels need to be made consistent.
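To make the terminator question above concrete, here is a minimal sketch of how a tag sequence could be assembled with either candidate terminator. It relies only on the fact that tag characters U+E0020 through U+E007E mirror ASCII 0x20 through 0x7E at an offset of 0xE0000. The base character, the payload "usca", and the function names are hypothetical and are not taken from L2/16-008.

```python
TAG_OFFSET = 0xE0000
TAG_TILDE = "\U000E007E"   # terminator currently chosen in the proposal
CANCEL_TAG = "\U000E007F"  # terminator favored in the feedback above


def to_tag_chars(ascii_text):
    """Map printable ASCII (0x20..0x7E) to tag characters U+E0020..U+E007E."""
    out = []
    for ch in ascii_text:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError(f"not printable ASCII: {ch!r}")
        out.append(chr(TAG_OFFSET + ord(ch)))
    return "".join(out)


def tag_sequence(base, payload, terminator):
    """Build base + tag characters + terminator (hypothetical layout)."""
    tags = to_tag_chars(payload)
    if terminator in tags:
        # The terminator cannot double as payload content.
        raise ValueError("payload contains the terminator character")
    return base + tags + terminator


# Hypothetical base (U+1F3F4) and payload, for illustration only.
seq = tag_sequence("\U0001F3F4", "usca", CANCEL_TAG)
print(" ".join(f"U+{ord(c):04X}" for c in seq))
```

In this sketch, choosing TAG TILDE as the terminator means a payload can never contain a tilde, whereas CANCEL TAG has no printable ASCII counterpart and so excludes nothing from the payload.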
Date/Time: Sun Jan 24 10:13:51 CST 2016
Name: A.R.Amaithi Anantham
Report Type: Error Report
Opt Subject: L2/15-256 and L2/16-030
Sir,

It is proposed to use the Tamil Nukta to represent sonants in tribal languages (vide Unicode document numbers L2/15-256 and L2/16-030). Diacritics are not allowed in the Tamil script and therefore should not be used. The only way is to use code points in the Tamil block that have not been used so far for the sonants. I therefore propose the above sonants at the following code points in the Tamil block: (1) Code point 0B87 for ௯ (G), (2) Code point 0BA1 for ௰ (DD), (3) Code point 0BA6 for ௲ (D), (4) Code point 0BAC for ௱ (B). The proposals of either the Tamil Nukta or diacritics are not needed at all.

With regards,
A.R.Amaithi Anantham
Date/Time: Mon Jan 25 23:55:42 CST 2016
Name: Agustin Fonts
Report Type: Feedback on an Encoding Proposal
Opt Subject: Feedback on L2/16-022 Condom Emoji Submission
We understand that Unicode would be considering safe sex as a part of emoji communications. However, we believe that limiting safe sex emojis to the condom is too restrictive. There are many other ways to practice safe sex for both men and women. Limiting such emojis to a condom emoji may indicate to users that safe sex is the sole responsibility of men and/or fully ensured by the condom. Giving users such an impression is not safe or inclusive. We would like to strongly recommend that Unicode not restrict the expression of safe sex to the condom, which would be just a marketing platform for condom manufacturers, but rather to create a safe sex category designed to promote safe sex for all genders.
Date/Time: Tue Jan 12 15:34:25 CST 2016
Name: Andy Heninger
Report Type: Error Report
Opt Subject: UAX 14 break rules for numbers
The following originated as an ICU bug report from Bernhard Fey, but the problem actually stems from the UAX 14 line break rules: http://bugs.icu-project.org/trac/ticket/12017

The break positions found in the text "start .789 end" are poor.

With the default UAX 14 rules the breaks would be
|start .789 |end|
(LB 13 prevents a break before the '.'; LB 25 prevents a break after it.)

With the suggested regular expression tailoring for numbers, used by ICU, they are
|start .|789 |end|

The correct breaking would be
|start |.789 |end|

How best to fix the problem will take some thought.
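The three segmentations in the report above can be written as break offsets into the string, which makes them easy to compare in a test. This is a small sketch only: the helper name and the offsets are derived by hand from the report's `|` notation and are not output from ICU or any other line-break implementation.

```python
def mark_breaks(text, offsets):
    """Insert '|' at each break offset (0 = before the first character)."""
    pieces = []
    previous = 0
    for offset in sorted(set(offsets)):
        pieces.append(text[previous:offset])
        pieces.append("|")
        previous = offset
    pieces.append(text[previous:])
    return "".join(pieces)


TEXT = "start .789 end"

# Offsets transcribed by hand from the report's notation.
DEFAULT_UAX14 = [0, 11, 14]     # no break next to the '.'
ICU_TAILORING = [0, 7, 11, 14]  # break between '.' and '789'
DESIRED = [0, 6, 11, 14]        # break after the space, before '.789'

for name, offsets in [
    ("default", DEFAULT_UAX14),
    ("tailored", ICU_TAILORING),
    ("desired", DESIRED),
]:
    print(f"{name:9} {mark_breaks(TEXT, offsets)}")
```

Expressed this way, the desired behavior differs from the default only by the extra opportunity at offset 6, and from the ICU tailoring only in whether the break falls at offset 6 or 7.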
Date/Time: Sun Jan 10 16:08:47 CST 2016
Name: Shai Berger
Report Type: Submission (FAQ, Tech Note, Case Study)
Opt Subject: FAQ about the UBA and Higher Level Protocols
Dear Unicode editorial committee,

Here is a Q&A pair for your consideration:

Q: When can a Higher-Level Protocol be used to override the default rules of the UBA?

A: Higher-Level Protocols apply in specialized contexts such as marked-up text, specific fields in forms, or specific fields in messages complying with pre-set formats. Generally, you can say some Higher-Level Protocol applies to a piece of text if all users of that piece in that context agree on the rules and semantics dictated by that protocol. As soon as some text's interpretation is governed by a Higher-Level Protocol, that text is no longer plain text. In particular, a program is not a protocol -- if a program claims to be a plain-text viewer, but presents all paragraphs with base direction LTR, it is not compliant with the Unicode standard.

Explanation and rationale:

The Unicode Bidi Algorithm, as specified in http://www.unicode.org/reports/tr9/, specifies a default algorithm for setting the base direction of a paragraph, but allows Higher Level Protocols to override this (http://www.unicode.org/reports/tr9/#HL1). This has been interpreted by some software developers as permission to pick the base direction using their own rules when dealing with plain text, claiming, essentially, that their program is a higher-level protocol.

Probably the most common example of such a program is Microsoft Outlook, which (for sure in versions up to and including Outlook 2010, but AFAIK to this day) allows its user to specify what base direction to give to all plain-text messages it reads or writes; this direction can be "auto", but if it is "RTL" or "LTR", UBA rules P2 and P3 are ignored. As you may imagine, this creates interoperability problems, to the point that many Hebrew users feel that plain-text is not an appropriate format for writing Hebrew mails.
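For readers unfamiliar with rules P2 and P3 referred to above, the default base-direction choice can be sketched in a few lines: find the first strong character (bidi class L, R, or AL), skipping anything between an isolate initiator and its matching PDI, and set the paragraph level to 1 if that character is R or AL, otherwise 0. This is a simplified illustration built on Python's `unicodedata` module, not a conformant UBA implementation; the function name is an assumption, and the result depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

ISOLATE_INITIATORS = {"LRI", "RLI", "FSI"}
STRONG = {"L", "R", "AL"}


def paragraph_level(text):
    """Approximate UBA rules P2/P3: 0 for an LTR paragraph, 1 for RTL."""
    depth = 0  # nesting depth of isolates whose content is skipped
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ISOLATE_INITIATORS:
            depth += 1
        elif bidi_class == "PDI":
            if depth > 0:
                depth -= 1
        elif depth == 0 and bidi_class in STRONG:
            return 1 if bidi_class in ("R", "AL") else 0
    return 0  # no strong character found: default to LTR


hebrew = "\u05E9\u05DC\u05D5\u05DD"  # a Hebrew word, written with escapes

print(paragraph_level(hebrew + " world"))  # first strong character is R
print(paragraph_level("hello " + hebrew))  # first strong character is L
```

A viewer that always uses level 0 (forced LTR) would disagree with this function on the first example, which is the kind of mismatch between sender and reader described in the paragraph above.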
My own view is that you cannot apply Higher-Level Protocols to plain text and still call it plain text; I think this follows from the dictionary definitions of the word "protocol" and the term "plain text". I also think plain text is required to "forbid" higher-level protocols by the emphasized remark on page 19 of the Unicode standard: "Plain text must contain enough information to permit the text to be rendered legibly, and nothing more."

As a Free Software enthusiast, I spent years thinking this was just another example of Microsoft's disrespect for standards, but recently I've encountered free-software developers, members of the relevant Israeli standards committee, who espouse the idea that a program can be a higher-level protocol; that according to the Unicode standard, bidirectional plain text is, in general, not enough to determine the correct presentation. So, I mentally apologize to Microsoft for ascribing to them either malice or incompetence in this matter; but I'd like to have the issue resolved.

I am suggesting that my understanding be published as a FAQ, assuming that, indeed, this is what the designers of the standard intended. If I am wrong, a clarification going the other way would be very welcome as well.

Thanks in advance,
Shai.