Comments on L2/22-230 "Mathematical notation profile for default identifiers"

Date/Time: Tue Oct 25 17:03:27 CDT 2022
Name: Kent Karlsson
Report Type: Other Document Submission
Opt Subject: 22230-math-profile.pdf

Comments on https://www.unicode.org/L2/L2022/22230-math-profile.pdf:
-------

1. “Math_Continue ≔ Math_Start ∪ [⁽ ₍ ⁾ ₎⁺₊⁼₌⁻₋⁰₀¹₁²₂³₃⁴₄⁵₅⁶₆⁷₇⁸₈⁹₉]”

This is a very bad idea, “math profile” or not, because it allows
identifiers that look like expressions to be evaluated. Using such
identifiers would be very confusing. If a programming language were to
allow this, “math profile” or not, the feature should be quickly
deprecated, and, deprecated or not, auxiliary programming style checkers
(commonly used in software projects) should implement checks to disallow
such identifiers.

As a side note, I would think that programming language standardisers
(and implementors) would not look kindly on a notion of “profile” for
identifiers... Users (programmers and software companies, esp. the people
considering coding style) certainly will not look kindly upon it.

That said, I’m more sympathetic to the subproposal of using “pre-subscripted
digits” (just the digits, and just the subscript ones) at the END of
identifiers (only at the end, NOT in the middle). More generally one would
use arrays or array-like constructs for general indexing, but for simple
cases, using pre-subscripted digits would do fine. Subscripts are often
used to “just” do indexing, not some other mathematical operation, and one
can regard that as part of the name if the index is just an explicit
(not computed) literal (small) natural number. So, “Math_Optional_End ≔
[₀₁₂₃₄₅₆₇₈₉]” (with associated changes) or even “Math_Optional_End ≔
[₀₁₂₃₄₅₆₇₈₉]+” (also without any “profile”), or similar, could work, but
multiple subscript digits in a sequence might not look good in a
fixed-width font... (I’m sure you can massage this in, without me having to
give the details here.)
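
To make the “only at the END” restriction concrete, here is a minimal
sketch in Python. It is an illustration under stated assumptions, not
proposed specification text: the ASCII start/continue classes are a
simplified stand-in for the real XID_Start/XID_Continue sets, and the
names IDENTIFIER and is_valid_identifier are hypothetical.

```python
import re

# Assumption: ASCII-only stand-in for XID_Start / XID_Continue, followed by
# an optional run of subscript digits (U+2080..U+2089) at the very end.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*[\u2080-\u2089]*")

def is_valid_identifier(name):
    """True if subscript digits occur only as a trailing run of the name."""
    return IDENTIFIER.fullmatch(name) is not None

for candidate in ["x", "x₁", "x₁₂", "x₁y", "₁x", "x⁻¹"]:
    print(candidate, is_valid_identifier(candidate))
```

With this pattern, “x₁” and “x₁₂” are accepted, while “x₁y” (subscript in
the middle), “₁x” (subscript at the start) and “x⁻¹” (superscripts) are
rejected. Replacing the final “*” group by a single optional character
would give the one-digit variant.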

--------

2. “the expressions do not usually represent multiple terms; or if they do,
they need to resort to a heavily degraded representation, such as log1p
(x) for log(1+x) in IEEE 754”

The reason for log1p in IEEE 754 (and ISO/IEC 10967-2, LIA-2) has absolutely
nothing to do with a “need to resort to a heavily degraded representation”.
It has to do with properties of floating-point numbers (as opposed
to “real” numbers in math), to allow higher accuracy (in floating-point)
when using log1p(x) compared to using log(1+x) in a numerical program. That
is, for values of x close to 0, the returned result is highly accurate for
log1p(x) and variants, max 0.5 ulp error (if implemented properly; 0 ulp
error, for natural logarithm) for x close to or in the “underflow area”,
whereas log(1+x) will return a very inaccurate 0, up to several billion
ulps relative to x (not relative to 1+x, of course), except for x being 0
itself. (ulp, units in the last place, is a relative error, not an absolute
error, so “several billion ulp” might not matter, depending on the rest of
the computation.) Ok, maybe that was too much about floating-point in this
context... Now, these numerical functions (log1p and similar) will need
names. And a name like “log⁺¹” (assuming that this will survive comment
submission, and not be normalized away or rejected as “forbidden”) would be
highly confusing and totally inappropriate. See point 1.
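
The accuracy difference between log1p(x) and log(1+x) can be seen in a
small Python sketch. It assumes IEEE 754 binary64 floats (the usual Python
float) and uses the standard library functions math.log and math.log1p;
the helper name compare_log1p is hypothetical.

```python
import math

def compare_log1p(x):
    """Return (log(1 + x), log1p(x)) so the two results can be compared."""
    naive = math.log(1.0 + x)   # 1.0 + x rounds to exactly 1.0 for tiny |x|
    accurate = math.log1p(x)    # works on x directly, no rounding of 1 + x
    return naive, accurate

naive, accurate = compare_log1p(1e-20)
print("log(1 + x):", naive)
print("log1p(x):  ", accurate)
```

For x = 1e-20 the sum 1.0 + x is not representable and rounds to 1.0, so
log(1+x) returns 0, while log1p(x) returns a value very close to x.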

--------