Accumulated Feedback on PRI #228

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Tue Jun 5 11:17:42 CDT 2012
Contact: cowan@ccil.org
Name: John Cowan
Report Type: Public Review Issue
Opt Subject: PRI #228 Changing some common characters from Punctuation to Symbol


I would be in favor of changing all the characters except HYPHEN-MINUS, 
which I believe is far more often used as a hyphen than as a minus sign, 
except in programming.  In particular, I think the argument that HYPHEN-MINUS 
is often used in names like "Emmeline Pethick-Lawrence" or "Jean-Pierre Rampal" 
is definitive.

I also considered the option of leaving the ASCII-repertoire characters alone 
for the sake of stability and changing the others, but that would pry apart 
the percent, per mille, and per ten thousand characters, which makes no 
sense to me.

Date/Time: Sun Jul 22 12:22:30 CDT 2012
Contact: timpart@perdix.demon.co.uk
Name: Timothy Partridge
Report Type: Public Review Issue
Opt Subject: PRI #228 Changing some common characters from Punctuation to Symbol


I have reservations about two of the characters.

The ampersand is a scribal abbreviation for the Latin word “et”, and its
translations into other languages, e.g. “and”. In some texts it has been used to
contract the appropriate letter sequence in the middle of a word. (U+A76B
LATIN SMALL LETTER ET might be intended to be used instead in that context.)
In some orthographies of the Marshallese language it is used to represent a
vowel. The ampersand is accepted in company names in the UK, e.g. “&&& LIMITED”
and “AND & CO RETAIL LIMITED”.

Along with U+002D HYPHEN-MINUS, its usage in names counts against changing its
type to symbol and potentially causing some software to prohibit it.

Many of the characters listed are the basis for the categorisation of other
characters in the standard. Depending on which characters in the original list
are chosen to become symbols, some of the following punctuation marks should
also be considered for change to symbols, for consistency.

U+2E36 DAGGER WITH LEFT GUARD
U+2E37 DAGGER WITH RIGHT GUARD
U+2E38 TURNED DAGGER
U+066A ARABIC PERCENT SIGN
U+FE6A SMALL PERCENT SIGN
U+FF05 FULLWIDTH PERCENT SIGN
U+0609 ARABIC-INDIC PER MILLE SIGN
U+060A ARABIC-INDIC PER TEN THOUSAND SIGN
U+FF03 FULLWIDTH NUMBER SIGN
U+204A TIRONIAN SIGN ET
U+FE60 SMALL AMPERSAND
U+FF06 FULLWIDTH AMPERSAND
U+FF0D FULLWIDTH HYPHEN-MINUS
U+FE63 SMALL HYPHEN-MINUS
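
For readers who want to check how the characters listed above are currently categorised, the following is a minimal sketch using Python's standard unicodedata module. The code points are copied from the list; the helper names describe and report are illustrative only. The values printed depend on the Unicode version bundled with the Python interpreter in use, so no particular output is shown.

```python
import unicodedata

# Code points copied from the list above; nothing else is assumed.
CODE_POINTS = [
    0x2E36, 0x2E37, 0x2E38,          # dagger variants
    0x066A, 0xFE6A, 0xFF05,          # percent signs
    0x0609, 0x060A,                  # per mille / per ten thousand
    0xFF03,                          # number sign
    0x204A, 0xFE60, 0xFF06,          # et / ampersand variants
    0xFF0D, 0xFE63,                  # hyphen-minus variants
]

def describe(cp):
    """Return (U+XXXX label, character name, General_Category) for one code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.category(ch))

def report(code_points=CODE_POINTS):
    """Describe every code point in the given sequence, in order."""
    return [describe(cp) for cp in code_points]

if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for label, name, gc in report():
        print(f"{label}  {gc}  {name}")
```

Running the script prints one line per character, which makes it easy to see which of them still share a General_Category with their ASCII counterparts.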


Date/Time: Mon Jul 23 12:40:58 CDT 2012
Contact: khw@cpan.org
Name: Karl Williamson
Report Type: Public Review Issue
Opt Subject: PRI #228 Changing some common characters from Punctuation to Symbol


I have reservations mainly about the Hyphen-Minus.  

But I do have a data point.  We at Perl have taken the proposed changes and
run them through CPAN regression tests. http://www.cpan.org/

CPAN consists of over 100K modules written in Perl.  As far as I know, there
was only one failure report.  I don't know what percentage of the modules have
been tested, but it is a non-trivial amount.  The people who would know best
are on vacation, and have been so for some time.

A reason for the lack of failures is that Perl started out with the POSIX
punctuation definition, which includes both gc=S and gc=P, so changing from
one to the other would be transparent to most Perl programs, which
would use the traditional definition.  The number of regression tests varies
widely depending on the module.  But the takeaway message is that, as far as
Perl goes, this change is probably acceptable.
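
The transparency argument above can be illustrated with a small sketch. This is not Perl's implementation; it is a Python stand-in that assumes a "broad" test accepting any General_Category starting with P or S, and a "narrow" test accepting only P. The PROPOSED table is a hypothetical subset of the change, and the target value "So" is an assumption made purely for illustration.

```python
import unicodedata

# Hypothetical override simulating the proposal for a few characters.
# The exact target category ("So") is an assumption, not taken from the PRI.
PROPOSED = {"%": "So", "&": "So", "#": "So"}

def category(ch, overrides=None):
    """General_Category of ch, optionally overridden to simulate a change."""
    if overrides and ch in overrides:
        return overrides[ch]
    return unicodedata.category(ch)

def broad_punct(ch, overrides=None):
    """POSIX-style test: true for punctuation (P*) or symbols (S*)."""
    return category(ch, overrides)[0] in ("P", "S")

def narrow_punct(ch, overrides=None):
    """Strict test: true only for punctuation (P*)."""
    return category(ch, overrides).startswith("P")

if __name__ == "__main__":
    for ch in PROPOSED:
        print(ch,
              "broad:", broad_punct(ch), "->", broad_punct(ch, PROPOSED),
              "narrow:", narrow_punct(ch), "->", narrow_punct(ch, PROPOSED))
```

Under these assumptions the broad test gives the same answer before and after the simulated change, while the narrow test flips, which matches the observation that only code relying on the strict distinction would notice.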

Date/Time: Wed Jul 25 17:27:48 CDT 2012
Contact: markus.icu@gmail.com
Name: Markus Scherer
Report Type: Public Review Issue
Opt Subject: PRI #228: P->S, should include data files


I think it would be best to publish for review a complete set of UCD files
for the proposed changes (including derived & test files as well as UCA &
CLDR root collation), without any other changes from a released version. That
would give implementers a better way to evaluate the changes by plugging the
modified data into their implementations.

 


Feedback above this line was reviewed at UTC #133, November 2012.

Date/Time: Wed Jan 2 17:47:18 CST 2013
Contact: markus.icu@gmail.com
Name: Markus Scherer
Report Type: Public Review Issue
Opt Subject: PRI #228 Changing chars from P to S vs. collation


One of the differences between Punctuation and Symbols is in the default 
collation order, and these characters should move in the DUCET if they 
change General_Category.

In particular, sorting and searching with "ignore punctuation" settings, 
for example with "alternate=shifted" in CLDR/ICU, will ignore punctuation 
characters but not symbols. Reviewers should take this into account.
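
The effect on "ignore punctuation" matching can be sketched with a toy comparison key. This is not the Unicode Collation Algorithm and does not use ICU or CLDR data; it simply drops characters whose General_Category starts with P or Z before comparing, as a rough stand-in for treating them as ignorable. The HYPOTHETICAL_SYMBOLS table and the category "So" are assumptions used only to simulate the proposed change.

```python
import unicodedata

# Hypothetical: pretend the ampersand has become a symbol (assumed "So").
HYPOTHETICAL_SYMBOLS = {"&": "So"}

def general_category(ch, overrides=None):
    """General_Category of ch, optionally overridden to simulate a change."""
    if overrides and ch in overrides:
        return overrides[ch]
    return unicodedata.category(ch)

def ignore_punct_key(text, overrides=None):
    """Toy key: drop punctuation (P*) and separators (Z*), keep symbols, fold case."""
    kept = [ch for ch in text
            if general_category(ch, overrides)[0] not in ("P", "Z")]
    return "".join(kept).casefold()

if __name__ == "__main__":
    # With the ampersand counted as punctuation, the two strings match.
    print(ignore_punct_key("AT&T") == ignore_punct_key("ATT"))
    # With the ampersand counted as a symbol, they no longer match.
    print(ignore_punct_key("AT&T", HYPOTHETICAL_SYMBOLS)
          == ignore_punct_key("ATT", HYPOTHETICAL_SYMBOLS))
```

A real implementation would rely on a collation library and its variable-weighting options rather than filtering characters by hand; the sketch only shows why a search for "ATT" could stop finding "AT&T" after such a change.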