The sections below contain links to permanent feedback documents for the open Public Review Issues as well as other public feedback as of July 24, 2012, since the previous cumulative document was issued prior to UTC #133 (November 2012). This document does not include feedback on moderated Public Review Issues from the forum that have been digested by the forum moderators; those are in separate documents for each of the PRIs. Gray items in the Table of Contents do not have feedback here.
The links below go directly to open PRIs and to feedback documents for them, as of January 25, 2013.
Issue  Name (+ feedback links)
228  Changing some common characters from Punctuation to Symbol
232  Proposed Update UAX #9, Unicode Bidirectional Algorithm
235  Proposed Update UTS #10, Unicode Collation Algorithm
236  Proposed Update UAX #11, East Asian Width (no feedback)
237  Proposed Update UAX #14, Unicode Line Breaking Algorithm
238  Proposed Update UAX #15, Unicode Normalization Forms (no feedback)
239  Proposed Update UAX #24, Unicode Script Property (no feedback)
240  Proposed Update UAX #29, Unicode Text Segmentation
241  Proposed Update UAX #31, Unicode Identifier and Pattern Syntax
243  Proposed Update UAX #38, Unicode Han Database (Unihan) (no feedback)
244  Proposed Update UAX #41, Common References for Unicode Standard Annexes (no feedback)
246  Proposed Update UAX #44, Unicode Character Database
247  Proposed Update UAX #45, U-Source Ideographs (no feedback)
The links below go to locations in this document for feedback.
Feedback on Encoding Proposals
Closed Public Review Issues
Date/Time: Sat Nov 24 00:06:26 CST 2012
Name: Masatoshi Kimura
Report Type: Error Report
Opt Subject: StandardizedVariants.txt contains forbidden variation sequences
According to TUS v6.1 clause 16.4, http://www.unicode.org/versions/Unicode6.2.0/ch16.pdf#page=15

> > The base character in a variation sequence is never a
> > combining character or a decomposable character.

However, the following base characters appearing in http://unicode.org/Public/6.2.0/ucd/StandardizedVariants.txt have a decomposition mapping.

203C => <compat> 0021 0021
2049 => <compat> 0021 003F
2139 => <font> 0069
24C2 => <circle> 004D
3297 => <circle> 795D
3299 => <circle> 79D8
1F21A => <square> 7121
1F22F => <square> 6307

Either clause 16.4 or StandardizedVariants.txt needs to be updated to fix the inconsistency.
Ken Whistler responded on 2012/11/26:
We do have a textual problem here. This should be filed in the Other Feedback section of the feedback document for the next UTC. I think the fix is a one-word addition: "decomposable character" to "canonical decomposable character" in the last paragraph on p. 556 in Section 16.4. But we also should add text regarding the stability of variation sequences across normalization forms, to make the implications clearer. And in any case, this needs UTC review. Maybe add my suggestions here as part of the feedback, so we don't have to reconstruct the context from scratch at the meeting. --Ken
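The mismatch described in this report can be checked mechanically. The following is a minimal sketch, assuming Python's standard `unicodedata` module, whose data reflects whichever UCD version ships with the interpreter rather than 6.2.0 specifically. The helper names are hypothetical. It looks up the decomposition mapping of each base character cited in the report and classifies it as canonical (no tag) or compatibility (tagged, such as `<compat>`), which is the distinction the suggested wording change relies on. It also checks whether one sample variation sequence is left unchanged by a given normalization form.

```python
import unicodedata

# Base characters cited in the report, as hex code points.
REPORTED_BASES = ["203C", "2049", "2139", "24C2", "3297", "3299", "1F21A", "1F22F"]


def decomposition_of(hex_cp):
    """Return the decomposition mapping for a code point, or '' if it has none."""
    return unicodedata.decomposition(chr(int(hex_cp, 16)))


def is_canonical(mapping):
    """A canonical mapping is non-empty and carries no <tag> prefix."""
    return bool(mapping) and not mapping.startswith("<")


def survives(form, text):
    """True if normalizing text to the given form leaves it unchanged."""
    return unicodedata.normalize(form, text) == text


if __name__ == "__main__":
    for cp in REPORTED_BASES:
        mapping = decomposition_of(cp)
        kind = "canonical" if is_canonical(mapping) else "compatibility"
        print(f"{cp} => {mapping} ({kind})")

    # U+203C followed by variation selector U+FE0E, used here as a sample sequence.
    sample = "\u203C\uFE0E"
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        print(form, "unchanged" if survives(form, sample) else "changed")
```

Under these assumptions, every listed base character has only a compatibility mapping, so the canonical forms (NFC, NFD) leave the sample sequence intact while the compatibility forms (NFKC, NFKD) replace its base character.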
Date/Time: Tue Dec 18 17:42:16 CST 2012
Report Type: Error Report
Opt Subject: Name aliases in Namelist.txt
The normative name alias for FEFF (i.e. Byte Order Mark) is not printed with the same special symbol (reference mark) as the normative name aliases for other "misnomers". I believe this is due to the fact that (as the only such alias) it was classified as "alternate" rather than "correction". I suggest a simple fix: to use % for alternate aliases when printing the nameslist. This will affect precisely one character and would not affect any public data files except for the nameslist (which is not intended to be machine parseable).
Date/Time: Wed Jan 23 12:08:11 CST 2013
Name: Roger Costello
Report Type: Error Report
Opt Subject: Error in Unicode Technical Report #36
In the Unicode Technical Report #36, Unicode Security Considerations, it says:

> PEP 383 takes this approach. It enables lossless conversion to Unicode by converting all "unmappable" sequences to a sequence of one or more isolated high surrogate code points. That is, each unmappable byte's value is a code point whose value is 0xDC00 plus byte value.

Notice "high surrogate" in that quote. I'm confused. I thought the low surrogate range started at 0xDC00, but this document is saying that 0xDC00 + byte value = high surrogate. Is that a typo in the document?
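The arithmetic in question can be verified directly. The sketch below assumes CPython's `surrogateescape` error handler, which is the implementation of PEP 383. The range constants and the `escape_byte` helper are hypothetical names introduced for illustration. It decodes a single undecodable byte and reports which surrogate range the resulting code point falls in.

```python
# Surrogate ranges as defined by the Unicode Standard.
HIGH_SURROGATES = range(0xD800, 0xDC00)  # U+D800..U+DBFF
LOW_SURROGATES = range(0xDC00, 0xE000)   # U+DC00..U+DFFF


def escape_byte(byte_value):
    """Decode one undecodable byte with surrogateescape; return the code point."""
    text = bytes([byte_value]).decode("utf-8", errors="surrogateescape")
    return ord(text)


if __name__ == "__main__":
    for byte_value in (0x80, 0xFF):
        cp = escape_byte(byte_value)
        kind = "low" if cp in LOW_SURROGATES else "high" if cp in HIGH_SURROGATES else "neither"
        print(f"byte 0x{byte_value:02X} -> U+{cp:04X} ({kind} surrogate)")

    # The mapping is lossless: encoding with the same handler restores the byte.
    assert "\udc80".encode("utf-8", errors="surrogateescape") == b"\x80"
```

Since 0xDC00 plus any byte value from 0x80 to 0xFF lies in U+DC80..U+DCFF, the resulting code points are low surrogates, which is consistent with the reporter's reading.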
Date/Time: Wed Dec 19 17:53:07 CST 2012
Name: Markus Scherer
Report Type: Other Question, Problem, or Feedback
Opt Subject: noncharacters should not be treated like ill-formed text
I found that I cannot edit some of the CLDR files (CJK collation tailorings) with the Gnome Linux default editor gedit because those files contain the noncharacter U+FDD0 and gedit treats noncharacters like ill-formed byte sequences. Other editors don't seem to do this, but I am having trouble pointing to a piece of the standard or the web site that clearly states that noncharacters are "better than ill-formed sequences".

We have a number of statements saying noncharacters should not be used in open interchange. In the standard, 16.7 Noncharacters even says "It is good practice, however, to recognize it as a noncharacter and to take appropriate action, such as replacing it with U+FFFD replacement character, to indicate the problem in the text." On the other hand, just a little further the standard says "In effect, noncharacters can be thought of as application-internal private-use code points." which is really how they are used in CLDR (for use in implementations of alphabetic indexes).

Please add a statement to the effect that noncharacters are "better than ill-formed sequences", and please remove the "good practice" to replace noncharacters with U+FFFD.
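The distinction drawn in this report can be illustrated with a short sketch. It assumes Python's built-in strict UTF-8 codec as the well-formedness check, and both helper names are hypothetical. It shows that the noncharacter U+FDD0 encodes to, and decodes from, a well-formed UTF-8 byte sequence, whereas a genuinely ill-formed sequence is rejected by the decoder.

```python
def is_noncharacter(cp):
    """True for the 66 noncharacters: U+FDD0..U+FDEF and the last two
    code points of each of the 17 planes (U+nFFFE and U+nFFFF)."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp <= 0x10FFFF and (cp & 0xFFFE) == 0xFFFE)


def is_well_formed_utf8(data):
    """True if the bytes decode under Python's strict UTF-8 codec."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    encoded = "\uFDD0".encode("utf-8")
    print(encoded.hex(), is_well_formed_utf8(encoded))  # noncharacter, well-formed
    print(is_well_formed_utf8(b"\xff"))                 # 0xFF never appears in UTF-8
    print(is_well_formed_utf8(b"\xed\xa0\x80"))         # encoded surrogate, ill-formed
    print(is_noncharacter(0xFDD0), is_noncharacter(0xFFFD))
```

An editor that wants to flag noncharacters could apply a check like `is_noncharacter` after decoding, rather than treating the bytes as undecodable.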