Accumulated Feedback on PRI #277

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Tue Jul 8 13:41:52 CDT 2014
Name: Karl Williamson
Report Type: Error Report
Opt Subject: PRI 277 Reconciling Script and Script_Extensions Character Properties

scx gives better results than sc.  Any application using sc and not scx
therefore has higher priority constraint(s) than giving the most accurate
results possible.  It doesn't matter what the cause.  It could be that the
maintainers have chosen not to keep up with TUS, or that it's too costly to
change an application originally written to use sc; or that the developers of
a new application decided it is too hard to cope with the multiple script
values that scx can evaluate to.  All these mean that there is something
higher priority for those applications than giving the most accurate results
possible.

Therefore, we do not need to be as concerned with such applications.  Their
highest priority is not the best results, so ours need not be either.

I favor alternative A, as it gives a better API for applications that do want
the best results, and keeps Unicode out of the business of choosing which is
the best fit default.

Date/Time: Thu Jul 10 03:04:44 CDT 2014
Name: Nattapong Sirilappanich
Report Type: Public Review Issue
Opt Subject: Feedback for 277 Reconciling Script and Script_Extensions Character Properties

I prefer policy B

Date/Time: Wed Jul 23 23:26:13 CDT 2014
Name: Roozbeh Pournader
Report Type: Public Review Issue
Opt Subject: PRI 277 issues

We are using both the Script and Script_Extensions properties in various
projects. Our code assumes that Policy A does not hold: we
use the Script property of characters with |Script_Extensions|>1 and
Script∉{Common, Inherited} as a hint to resolve the character's script in
cases where not enough context exists for deciding what font or shaping module
to use for rendering a certain character or sequence.
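
As a rough illustration of the heuristic described above, here is a minimal
Python sketch. The function name, its parameters, and the use of ISO 15924
codes (Zyyy for Common, Zinh for Inherited) are assumptions made for this
example; it is not taken from any actual project, and the property values
must be supplied by the caller.

```python
from typing import FrozenSet, Optional

COMMON, INHERITED = "Zyyy", "Zinh"  # ISO 15924 codes for Common and Inherited


def resolve_script(sc: str, scx: FrozenSet[str],
                   context: Optional[str] = None) -> Optional[str]:
    """Guess one script for a character (hypothetical helper).

    sc      -- the character's Script value
    scx     -- the character's Script_Extensions value set
    context -- script of the surrounding text, if known
    """
    # A single-valued Script_Extensions set needs no further resolution.
    if len(scx) == 1:
        return next(iter(scx))
    # Prefer the surrounding context when it is one of the listed scripts.
    if context in scx:
        return context
    # Not enough context: use an explicit Script value as the hint.
    if sc not in (COMMON, INHERITED):
        return sc
    # Script is Common or Inherited, so there is no hint to fall back on.
    return None
```

With Script=Inherited and Script_Extensions={Latn, Cyrl}, the function has
nothing to fall back on and returns None when no context is given. With an
explicit Script=Cyrl it returns "Cyrl", which is the kind of hint that only
works if Policy A does not hold.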

Adopting policy B would also open up the opportunity to move more characters
away from Script∈{Common, Inherited} and assign them explicit Script values.
For example, U+0485 COMBINING CYRILLIC DASIA PNEUMATA and U+0486 COMBINING
CYRILLIC PSILI PNEUMATA, which currently have Script=Inherited and
Script_Extensions = {Latn, Cyrl}, could change to Script=Cyrl to hint toward
a Cyrillic preference when the character is used out of context, like over a
non-breaking space or dotted circle. Likewise, the Arabic-Indic digits, which
are used in Thaana only very rarely (but are specified as Script=Common
because of the Thaana usage), could become Script=Arabic with minimal effect
on Thaana users.

Also note that the algorithms suggested in the note at the end of the PRI
("processing Script_Extensions to choose the Recommended Script from that
value set, if there is exactly one; or picking among the scripts of the
implementation-supported languages"), while useful, are not sufficient. They
are hard to implement without each implementation trying to gather its own
per-character data, which leads to extra costs and incompatibilities. When the
Unicode Consortium has access to such data, it should provide them to the
public through the properties.
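
To make the dependency on extra data concrete, the two strategies quoted
from the PRI note could be sketched in Python roughly as follows. The
function and parameter names are hypothetical, and both the `recommended`
and `supported` sets are assumed to be supplied by the implementation,
which is exactly the per-implementation data the paragraph above refers to.

```python
from typing import AbstractSet, List


def candidate_scripts(scx: AbstractSet[str],
                      recommended: AbstractSet[str],
                      supported: AbstractSet[str]) -> List[str]:
    """Narrow a Script_Extensions set to candidate scripts (hypothetical).

    scx         -- the character's Script_Extensions value set
    recommended -- scripts the implementation treats as Recommended
    supported   -- scripts of the languages the implementation supports
    """
    # Strategy 1: exactly one Recommended script in the set decides it.
    rec = scx & recommended
    if len(rec) == 1:
        return sorted(rec)
    # Strategy 2: otherwise restrict to the implementation's own scripts.
    # More than one result (or none) still leaves the choice open.
    return sorted(scx & supported)
```

When the result holds more than one script, or none, the implementation
still has to choose by some other means, and two implementations with
different `supported` sets can reach different answers for the same
character.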

The Script and Script_Extensions properties of characters should not be
considered very stable, and new information provided to the UTC should be used
to drive improvements to their values. This would improve rendering systems,
both at font design time (by knowing which scripts use which characters) and
at rendering run time (figuring out which font to use for each character).

[PS: It would have been useful for the Script_Extensions property to provide a
sorted or partially sorted list, instead of an unordered set, but that
information is very hard to resolve and maintain properly, if not
structurally hard to specify and implement.]

Date/Time: Fri Sep 26 14:44:31 CDT 2014
Name: Jim Jewett
Report Type: Public Review Issue
Opt Subject: PRI #277

I prefer policy B, and Roozbeh Pournader has largely explained why.

I am sympathetic to Karl Williamson's point that applications using sc and not 
scx have some priorities higher than maximum correctness, but I think that is 
better dealt with by tailoring the conformance criteria.  For example:

Implementations MAY choose to partially support script extensions, by treating 
characters with a multi-valued script extension property as though the script 
property were "Common".
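
A minimal Python sketch of that fallback, under the assumption that the
caller already has the Script and Script_Extensions values at hand (the
function name and the use of the ISO 15924 code Zyyy for Common are
choices made for this example only):

```python
from typing import AbstractSet

COMMON = "Zyyy"  # ISO 15924 code for Common


def effective_script(sc: str, scx: AbstractSet[str]) -> str:
    """Script value used by a partially supporting implementation.

    sc  -- the character's Script value
    scx -- the character's Script_Extensions value set
    """
    # A multi-valued Script_Extensions set is treated as Common.
    if len(scx) > 1:
        return COMMON
    # Otherwise the character's own Script value is used unchanged.
    return sc
```

Such an implementation would ignore any explicit Script value on
characters with several script extensions, so it would behave the same
way under either policy.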