Public Review Issues

Accumulated Feedback on PRI #427

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Sun Mar 21 16:48:51 CDT 2021
Name: David Corbett
Report Type: Public Review Issue
Opt Subject: PRI #427: Misuse of subscripts

Note: This feedback has been reviewed and changes are reflected in revision 22, draft 3.


The proposed update introduces the notations ∁ₛ and ∁ₚ for the complements
of 𝕊 and ℙ, using subscript plain lowercase letters in place of subscript
double-struck capital letters. This use of U+209B and U+209A goes against
Unicode’s general principle of subscripts, as described in section 22.4,
that “style or markup in rich text” is preferred when possible, except in
phonetic alphabets. Because UTS #18 is written in HTML, it should use
`∁<sub>𝕊</sub>` and `∁<sub>ℙ</sub>`.
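
As a rough illustration of the point above, the following Python sketch looks up the two subscript characters in question and builds the suggested markup instead. The helper name `html_subscript` is hypothetical and not part of UTS #18; the character names come from Python's standard `unicodedata` module.

```python
import unicodedata

# The two code points the feedback refers to.
for ch in "\u209B\u209A":
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

def html_subscript(base: str, sub: str) -> str:
    """Hypothetical helper: express a subscript with HTML markup
    rather than with a dedicated subscript character."""
    return f"{base}<sub>{sub}</sub>"

# U+2201 COMPLEMENT with the double-struck letters as marked-up subscripts.
print(html_subscript("\u2201", "\U0001D54A"))  # complement of double-struck S
print(html_subscript("\u2201", "\u2119"))      # complement of double-struck P
```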

Date/Time: Wed Jun 16 23:40:43 CDT 2021
Name: Wang Yifan
Report Type: Error Report
Opt Subject: PRI #427: Examples out of line with UCD

This feedback has been directed to the UTC and there are actions to make changes.

Some examples currently given in UTS #18 seem to be either wrong or
outdated.

In the table showing expressions related to hiragana under Section 1.2.6:

Expression   | Contents of Set
\p{sc=Hira}  | [ぁ-ゖゝ-ゟ𛀁🈀]
\p{scx=Hira} | [、-〃〆〈-】〓-〟〰-〵〷〼-〿ぁ-ゖ ゙-゠・ー㆐-㆟㇀-㇣㈠-㉃㊀-㊰㋀-㋋㍘-㍰ ㍻-㍿㏠-㏾﹅﹆｡-･ｰﾞﾟ𛀁🈀]

But neither line reflects the current state of the set in U13.0 or the
proposed U14.0. Moreover, the table contains some unneeded spaces.

They should look like (I'm just writing manually; please generate from 
data files for accuracy):

Expression   | Contents of Set
\p{sc=Hira}  | [ぁ-ゖゝ-ゟ𛀁-𛄞𛅐-𛅒🈀]
\p{scx=Hira} | [、-〃〈-】〓-〟〰-〵〷〼〽ぁ-ゖ゙-゠・ー﹅﹆｡-･ｰﾞﾟ𛀁-𛄞𛅐-𛅒🈀]
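
One possible way to generate such a set from the data files, as requested above, is sketched below in Python. It assumes input in the Scripts.txt format; the embedded `SAMPLE` is only a short illustrative excerpt, not the full file, so the result is deliberately incomplete. The names `script_ranges` and `as_set` are hypothetical helpers.

```python
import re

# Illustrative excerpt in Scripts.txt format (NOT the complete file).
SAMPLE = """\
3041..3096    ; Hiragana # Lo  [86] HIRAGANA LETTER SMALL A..HIRAGANA LETTER SMALL KE
309D..309E    ; Hiragana # Lm   [2] HIRAGANA ITERATION MARK..HIRAGANA VOICED ITERATION MARK
309F          ; Hiragana # Lo       HIRAGANA DIGRAPH YORI
30A1..30FA    ; Katakana # Lo  [90] KATAKANA LETTER SMALL A..KATAKANA LETTER VO
1F200         ; Hiragana # So       SQUARE HIRAGANA HOKA
"""

# "XXXX" or "XXXX..YYYY", then ";", then the script name.
LINE = re.compile(r"^([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*;\s*(\w+)")

def script_ranges(text, script):
    """Collect the code point ranges for one script and merge adjacent ones."""
    ranges = []
    for line in text.splitlines():
        m = LINE.match(line)
        if not m or m.group(3) != script:
            continue  # comment, blank line, or another script
        start = int(m.group(1), 16)
        end = int(m.group(2) or m.group(1), 16)
        ranges.append((start, end))
    ranges.sort()
    merged = []
    for start, end in ranges:
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def as_set(ranges):
    """Render merged ranges in bracketed set notation, without spaces."""
    parts = [chr(s) if s == e else f"{chr(s)}-{chr(e)}" for s, e in ranges]
    return "[" + "".join(parts) + "]"

print(as_set(script_ranges(SAMPLE, "Hiragana")))
```

Running the same functions over the full Scripts.txt for a given Unicode version would pick up the supplementary-plane ranges that the excerpt omits.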

Also, the second (currently first) table under Section 1.1:

Syntax                       | Matches
[\u{3040}-\u{309F} \u{30FC}] | Hiragana characters, plus prolonged sound sign

The description is not accurate enough and is misleading as of today.
It should say "Hiragana block code points" instead of "Hiragana characters"
for maximal accuracy.
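
To see why "block code points" is the more accurate wording, a small Python sketch can list the code points in the range U+3040..U+309F that have no character assigned. It relies on the Unicode version bundled with the Python interpreter, so the exact list may differ between releases.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x3040, 0x309F  # the range used in the example

# General_Category "Cn" means the code point is unassigned.
unassigned = [
    cp for cp in range(BLOCK_START, BLOCK_END + 1)
    if unicodedata.category(chr(cp)) == "Cn"
]
print([f"U+{cp:04X}" for cp in unassigned])

# The extra code point in the example lies outside the Hiragana block.
print(unicodedata.name("\u30FC"))
```

Any code point reported here is matched by the range expression without being a Hiragana character.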

Though I only spotted issues around Hiragana because they caught my eye,
there could be more examples needing maintenance.

Date/Time: Thu Dec 16 22:29:51 CST 2021
Name: Karl Williamson
Report Type: Error Report
Opt Subject: UTS 18

U+0F33 TIBETAN DIGIT HALF ZERO has a numeric value of -0.5.  (I believe the 
existence of this character in the wild is apocryphal however.) There is no rule 
against other code points becoming encoded with a negative value.

However, UTS 18 says the hyphen-minus sign is supposed to be ignored within \p{} 
constructs, leaving no way to legally specify negative values.

I suspect that UTS 18 should be clarified to indicate that the hyphen-minus at
the beginning of a number should not be ignored, even with loose matching.
But then what to do about two in a row?
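
The problem can be sketched in Python as follows. `loose_key` is a simplified stand-in for loose matching (it drops case, whitespace, underscores, and hyphens) and is not the exact rule text of UTS 18; `loose_numeric_key` is a hypothetical variant that keeps one leading hyphen-minus as a sign. Treating two leading hyphens as a single sign is an arbitrary choice made only for this sketch, since the question above is left open.

```python
import unicodedata

def loose_key(value: str) -> str:
    """Simplified loose matching: ignore case, whitespace, '_' and '-'."""
    return "".join(
        c for c in value.lower() if not c.isspace() and c not in "_-"
    )

def loose_numeric_key(value: str) -> str:
    """Hypothetical variant: keep a leading hyphen-minus as the sign."""
    stripped = value.strip()
    sign = "-" if stripped.startswith("-") else ""
    return sign + loose_key(stripped)

half_zero = "\u0F33"  # TIBETAN DIGIT HALF ZERO
print(unicodedata.numeric(half_zero))          # -0.5

# With every hyphen ignored, the negative and positive values collide.
print(loose_key("-1/2") == loose_key("1/2"))   # True

# Keeping the leading sign tells them apart.
print(loose_numeric_key("-1/2"), loose_numeric_key("1/2"))
```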

Date/Time: Tue Jan 18 22:34:08 CST 2022
Name: Norbert Lindenberg
Report Type: Error Report
Opt Subject: UTS 18: Unicode Regular Expressions

In a table showing "current examples of escape syntax for Unicode code 
points", UTS 18: Unicode Regular Expressions shows "\uD83D\uDC7D" in the 
row that includes JavaScript. This example is no longer current: ECMAScript 
2015 introduced the \u{xxxxxx} escape syntax for Unicode code points, so 
that "\u{1F47D}" now works.
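
For reference, the relationship between the two escapes can be checked with a short Python sketch: the surrogate pair D83D DC7D is the UTF-16 form of the single code point U+1F47D. The function name `surrogates_to_code_point` is a hypothetical helper for this illustration.

```python
def surrogates_to_code_point(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

cp = surrogates_to_code_point(0xD83D, 0xDC7D)
print(f"U+{cp:04X}")                      # U+1F47D

# Going the other way: the UTF-16 encoding of U+1F47D is the same pair.
print(chr(cp).encode("utf-16-be").hex())  # d83ddc7d
```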

Date/Time: Sat Jan 22 04:46:41 CST 2022
Name: Ivan Panchenko
Report Type: Error Report
Opt Subject: UTS #18

UTS #18 contains the following mistakes:

“expressions.The” (instead of “expressions. The”),
“Database[UAX44]” (instead of “Database [UAX44]”),
“three character” (instead of “three characters”),
“"False" In” (instead of “"False". In”),
“Name values, must” (instead of “Name values must”),
“the the” (instead of “the”),
“see see” (instead of “see”),
“does offers” (instead of “does offer”),
“Equivalents .” (instead of “Equivalents.”),
“(see NameAliases.txt).In” (instead of “(see NameAliases.txt). In”),
“Properties .” (instead of “Properties.”).

An out-of-place closing parenthesis is found here: “[Perl].)” (instead of “[Perl].”).

Finally, a closing parenthesis is missing after “COMBINING DIAERESIS”.