Re: Superscript and Subscript Characters in General Use

From: Marcel Schneider <charupdate_at_orange.fr>
Date: Wed, 4 Jan 2017 15:20:40 +0100 (CET)

On Wed, 4 Jan 2017 00:36:38 -0500, Asmus Freytag wrote:
>
> On 1/3/2017 4:24 PM, Marcel Schneider wrote:
> > On Tue, 3 Jan 2017 09:31:42 +0100, Christoph Päper wrote:
> >
> >>> Among the possibilities, you include Unicode subscripts.
> >> Just for the sake of completeness.
> > This tends to conclude that preformatted subscripts are really an option here.
>
> Not so. You yourself quote this statement:
>
> | Superscript modifier letters are intended for cases where the letters carry
> | a specific meaning, as in phonetic transcription systems, and are not
> | a substitute for generic styling mechanisms for superscripting of text,
> | as for footnotes, mathematical and chemical expressions, and the like.
>
> It is clear that the uses that you advocate go against this intent.

This is because, even when complemented by the UAXes and TRs, the core specification
cannot cover the whole of actual practice. It seems that, to stay within reasonable
limits, a significant number of use cases have been left out. For example, the
mentioned use of plain text for styled custom vulgar fractions is a recognized
practice, yet it remains persistently excluded from TUS. Since including it would
take no more than three lines of text, length alone cannot be the reason. For
technical as well as ethical reasons, Unicode is unable to promote the usages under
discussion, but it does not strongly discourage them either. The snippet above [1]
would be less harsh, at the expense of some redundancy, if it read:

| Superscript modifier letters are intended for cases where the letters carry
| a specific meaning, as in phonetic transcription systems, and are not INTENDED
| AS a substitute for generic styling mechanisms for superscripting of text,
| as for footnotes, mathematical and chemical expressions, and the like.

This amounts to saying that super-/subscripting in more or less ordinary text lies
outside the design principles of the Unicode Standard, because the boundary between
the feasible and the infeasible would be hard to draw, as shown by the recent
example of the plain text database for chemical formulas. So, to protect itself
against the temptation of drawing that boundary (and the risk of subsequently being
compelled to move it further), Unicode *declares* those characters to be
*intended for* special contexts, in accordance with their encoding history.

Trying to understand to what extent this principle is applicable, I note that
the three cited examples usually involve much more formatting than superscripting
alone. This is the case for structural formulae in _chemistry_, complex
_mathematical_ expressions, and _footnote_ management and layout. By contrast, when
itʼs only about super- or subscripting a few digits or Latin letters, markup and
rich text may be considered overkill. And in content that the reader may wish to
copy and paste, things like the “16” affix of hex numbers should remain distinct.
Hence, styling is only “the preferred means”, not the mandatory way, to represent
superscript letters or digits.[2] And this is tied to a /design/ principle of the
Standard. I believe that /usage/ principles may diverge.
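
The copy-paste point can be illustrated with a minimal sketch in Python, using
only the standard unicodedata module. The sample string "FF₁₆" and the helper
name describe() are assumptions made up for this illustration; they come from
neither TUS nor this thread. The sketch shows that the preformatted subscript
digits survive as plain text, and that only compatibility normalization (NFKC)
folds them into ordinary digits.

```python
import unicodedata

# Hypothetical sample: a hex number whose base is written with
# preformatted subscript digits (U+2081, U+2086), i.e. "FF₁₆".
hex_plain = "FF\u2081\u2086"

def describe(text):
    """Return (char, name, decomposition) for each non-ASCII character."""
    return [(ch, unicodedata.name(ch), unicodedata.decomposition(ch))
            for ch in text if ord(ch) > 0x7F]

info = describe(hex_plain)
for ch, name, decomp in info:
    # Each subscript digit carries a <sub> compatibility decomposition.
    print(f"U+{ord(ch):04X} {name}: {decomp}")

# Canonical normalization leaves the subscript digits untouched ...
kept = unicodedata.normalize("NFC", hex_plain)

# ... while compatibility normalization replaces them with ASCII digits,
# so the base marker is no longer distinct from the number itself.
folded = unicodedata.normalize("NFKC", hex_plain)

print(kept == hex_plain)
print(folded)
```

With markup-based subscripts, by contrast, whether the “16” stays distinct after
copying depends on what the receiving application does with the styling.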

>
> Therefore, your conclusion that this is "an option" is nothing more than
> a very personal opinion on your part (and one that many people here would
> consider misguided if presented as general recommendation).

Presenting this as a general recommendation was indeed what I intended when starting
the first thread of this discussion. Thanks to your replies and those of other
subscribers, Iʼve come to realize that this cannot be recommended across the board.
However, why this should not be "an option" at all still remains very unclear to me.
From prior discussions, we know that other list participants do use e.g.
superscript characters more extensively.

I think there are two levels of action:

(1) to encode new preformatted characters;
(2) to encourage re-use of already existing ones.

I understand that Unicode is consistently reluctant to do either, while ISO/IEC is
able to do more on (1), given that they sometimes add characters to (or remove them
from) the new repertoire, and National Bodies are in a position to do (2) through
usage recommendations of their own. Not to mention all the other people who may or
may not use the available preformatted characters for any purpose, possibly sharing
the tip and, in the best case, the means to input them efficiently.

Or am I missing something?

Given that the WG for the French standard keyboard is currently interested in getting
a new ordinal indicator encoded (a kind of 'ᵉ'), I feel all the more compelled to
stay tuned, and to comment on subsequent e-mails, too.
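
For orientation, the properties of the characters that already exist can be
listed with a short sketch in Python's standard unicodedata module. The helper
name profile() is an assumption for this illustration, and the output says
nothing about how a newly proposed ordinal indicator would be defined. It only
compares the two encoded ordinal indicators (U+00AA, U+00BA) with the modifier
letter 'ᵉ' (U+1D49).

```python
import unicodedata

def profile(ch):
    """Collect a few UnicodeData properties for one character."""
    return {
        "code": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        "decomposition": unicodedata.decomposition(ch),
    }

# The two existing ordinal indicators, followed by the modifier letter
# that is currently used as a stand-in for a French ordinal 'e'.
rows = [profile(ch) for ch in "\u00AA\u00BA\u1D49"]

for row in rows:
    print(row["code"], row["name"], row["category"], row["decomposition"])
```

All three carry a <super> compatibility decomposition, so NFKC folds each of
them to its base letter; they differ in name and general category.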

Marcel

[1] TUS 9.0, §7.8, p. 327.
http://www.unicode.org/versions/Unicode9.0.0/ch07.pdf#G24762

[2] TUS 9.0, §22.4, p. 786.
http://www.unicode.org/versions/Unicode9.0.0/ch22.pdf#G42931
Received on Wed Jan 04 2017 - 08:21:06 CST
