Re: Superscript and Subscript Characters in General Use

From: Marcel Schneider <charupdate_at_orange.fr>
Date: Fri, 6 Jan 2017 20:02:25 +0100 (CET)

Another important condition for the modifier letter fallbacks to work (where they
are supported at all) is that fonts support combining diacritics on modifier small
letters. In 2014 I requested the superscript small 'è' (not noticing that the intended
abbreviation is incorrect), but encoding new characters like this one would be
useless because it is decomposable, and too late in any case, since the deadline is
long past. But the superscript 'é' that Iʼve recently mentioned is still used (in
'S^{té}' for 'Société' [Corporation], different from 'Sᵗᵉ' which is the abbreviation
of 'Sainte' [Saint, feminine]); and in Spanish, superscript 'í' is used, as Denis
Jacquerye noted while pointing out the need to work with, and enhance support of,
higher-level protocols. [4]
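
To make the point about diacritics concrete, here is a minimal sketch in Python
(standard library only; the function and constant names are illustrative assumptions).
It shows that a superscript 'é' can only be written in plain text as a modifier letter
followed by a combining mark, so the result depends on the font positioning that mark:

```python
import unicodedata

MOD_SMALL_T = "\u1d57"      # MODIFIER LETTER SMALL T
MOD_SMALL_E = "\u1d49"      # MODIFIER LETTER SMALL E
COMBINING_ACUTE = "\u0301"  # COMBINING ACUTE ACCENT


def superscript_e_acute() -> str:
    """Return the plain-text spelling of a superscript 'é'.

    NFC normalization leaves the sequence unchanged, because no
    precomposed "modifier letter small e with acute" is encoded.
    """
    return unicodedata.normalize("NFC", MOD_SMALL_E + COMBINING_ACUTE)


# 'S^{té}' (Société) written with modifier letters only.
societe = "S" + MOD_SMALL_T + superscript_e_acute()

for ch in societe:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The sequence stays two code points long after normalization, which is why font
support for combining marks on modifier letters matters here.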

Higher-level protocols will remain the recommended, standard high-end solution,
while the use of modifier letters could gain the status of an alternate fallback.
Once it has that status, modifier letter small q could be encoded and the whole set
updated at font level to support combining diacritics, while software could add two
commands for round-trip conversion between modifier letters and superscripted baseline
letters, and probably between preformatted fractions and formatted fractions. Iʼm quite
sure that all this is already possible in VBA.
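
One direction of such a conversion can be sketched as follows (Python rather than
VBA, standard library only). This is an illustration under stated assumptions, not an
existing tool: it treats the '^{...}' notation used in this mail as the superscript
markup, derives the letter mapping from the <super> compatibility decompositions in
the Unicode Character Database, and uses hypothetical function names:

```python
import re
import unicodedata


def build_superscript_map() -> dict:
    """Map baseline letters to a modifier/superscript letter, if one exists."""
    table = {}
    for cp in range(0x00A0, 0x2100):
        ch = chr(cp)
        parts = unicodedata.decomposition(ch).split()
        if len(parts) != 2 or parts[0] != "<super>":
            continue
        name = unicodedata.name(ch, "")
        # Keep letters only; this skips superscript digits and the
        # ordinal indicators, which have other names.
        if name.startswith(("MODIFIER LETTER", "SUPERSCRIPT LATIN SMALL LETTER")):
            table.setdefault(chr(int(parts[1], 16)), ch)
    return table


SUPERSCRIPT_OF = build_superscript_map()
MARKUP = re.compile(r"\^\{([^{}]*)\}")  # assumed notation: ^{...}


def to_modifier_letters(text: str) -> str:
    """Replace ^{...} groups by modifier letters where every letter has one."""

    def convert(match):
        # Decompose so that 'é' becomes 'e' + COMBINING ACUTE ACCENT.
        decomposed = unicodedata.normalize("NFD", match.group(1))
        out = []
        for ch in decomposed:
            if unicodedata.combining(ch):
                out.append(ch)  # combining marks are kept unchanged
            elif ch in SUPERSCRIPT_OF:
                out.append(SUPERSCRIPT_OF[ch])
            else:
                return match.group(0)  # no fallback letter: keep the markup
        return "".join(out)

    return MARKUP.sub(convert, text)


print(to_modifier_letters("S^{te} and S^{t\u00e9}"))
```

A group is left untouched as soon as one of its letters has no modifier counterpart
in the Unicode version the interpreter ships with, so the text never ends up
half-converted.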

Iʼve added some more references to my previous mail with respect to last yearʼs
discussion of formatting variation selectors. As it contained a typo and lacked some
line breaks (symptomatic of not using a spell checker and of editing the layout by
hand in a text editor), I include the corrected version below.

Best regards,

Marcel

On Fri, 6 Jan 2017 00:21:29 -0800, Asmus Freytag wrote:
>
> On 1/5/2017 9:42 PM, Marcel Schneider wrote:
> >
> > Nevertheless,
> > the user might prioritize the stability of the document when it comes to plain text,
> > and he could be interested in a better-looking display of letters that elsewhere
> > should be superscripted. Here, the modifier letters could be a ready-to-use fallback
>
> The use of such hacks is destabilizing to any efforts to systematically format superscripts
> across a document.

That presupposes a rich text environment. Orthographic correctness in some
languages, French among them, traditionally requires either a rich text environment
or some in-line markup like TeX (at the expense of direct usability, i.e. without
a LaTeX converter). That comes close to non-conformance with the design principles
of Unicode. As I understand them, Unicode provides all characters that are needed to
correctly spell any language. This goal remains unreached as long as the orthography
of some languages cannot be entirely achieved without relying on formatting markup.
(Iʼm aware that complex scripts require smart fonts, e.g. with OpenType layout
tables, for glyph reordering and glyph substitution, but this still is plain text.)

The superscripting of abbreviation endings belongs to a different level of correctness
from discretionary emphasis as expressed with italics, bold, underline (obsolete in
this use), extra letter spacing (German, rather old-style), capitalization, or
extra acute accents as in Dutch.

This is why Karl Pentzlin [1] cited ‘Biblio^{que}’ vs “Biblioque”, where the latter
is “no valid French word.”

From this it becomes clear that Alastair Houghtonʼs [2] suggestion of encoding
a superscript variation selector would meet this requirement, and should therefore
not be mistaken for a first step towards making Unicode support rich text, which
was the traditional argument against previous similar suggestions. [3]

Under the current scheme, French and a few other languages cannot be written
with correct orthography when the environment is plain text. That seems to me
hard to accept.

> Text fonts may not support them, because for "ordinary" text, by Unicode's
> recommendation, one would use ordinary letters / digits with superscript markup.

A text font that does not support all modifier letters is less a text font than
a display font. Ornamental fonts are produced in such variety that completing
them was economically unfeasible. I put this statement in the past tense because
accented letters are already (on request) automatically generated and added to the
font at creation. If automatic generation of superscript glyphs isnʼt implemented
already, it soon will be, I suppose. So more and more (new and updated) fonts will
support them. But wherever they donʼt, a _Convert modifier letters to superscript_
feature (or an equivalent macro command) ought to be able to make the text conform
to legacy handling.
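
Such a conversion feature could look roughly like the following sketch (Python,
standard library only). The function names and the '^{...}' output notation are
assumptions made for illustration; a word processor would apply superscript
formatting instead of emitting markup:

```python
import unicodedata
from typing import Optional


def base_of_superscript(ch: str) -> Optional[str]:
    """Return the baseline character if ch decomposes to '<super> X'."""
    parts = unicodedata.decomposition(ch).split()
    if len(parts) == 2 and parts[0] == "<super>":
        return chr(int(parts[1], 16))
    return None


def modifiers_to_markup(text: str) -> str:
    """Rewrite runs of superscript-like characters as ^{baseline letters}.

    Note: this also converts superscript digits and ordinal indicators,
    since they carry the same kind of decomposition.
    """
    out = []
    run = []  # baseline characters of the current superscript run

    def flush():
        if run:
            # Recompose 'e' + combining acute into 'é' where possible.
            out.append("^{" + unicodedata.normalize("NFC", "".join(run)) + "}")
            run.clear()

    for ch in text:
        base = base_of_superscript(ch)
        if base is not None:
            run.append(base)
        elif run and unicodedata.combining(ch):
            run.append(ch)  # a combining mark stays with its letter
        else:
            flush()
            out.append(ch)
    flush()
    return "".join(out)


print(modifiers_to_markup("S\u1d57\u1d49"))        # modifier t, e
print(modifiers_to_markup("S\u1d57\u1d49\u0301"))  # with combining acute
```

Reading the decomposition data instead of a hand-written table means the sketch
follows whatever set of modifier letters the installed Unicode data defines.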

> So, by using these hacks, anytime a document is re-formatted with a different font style,
> you are in danger of either losing these to boxes, or to be faced with random font styles.

Yes, people should always be aware that the use of modifier letters has its downsides,
as has the use of superscripted baseline letters. I currently write e-mails (like
this one) in a text editor (Notepad++). Several features I use here are IMO missing
in all e-mail clients, such as column editing, line reordering, and so on. So I
appreciate being able to spell correctly in plain text, without sloppy fallbacks
(i.e. baseline fallbacks for superscript). Itʼs a matter of making the most of the
existing character set. I believe that modifier letter fallbacks are very functional.
When I paste them into an HTML mail form, the display has always been correct in my
experience, and I donʼt need to add superscript by hand throughout the mail.
Furthermore, I can even use superscript in the subject line.

> If you don't think that is a real problem: some (many) character pickers will insert font+code point into
> an application. These font bindings often survive and suddenly your text, when read on a different
> computer looks like a ransom note, just because the new machine has a new "default" font, and
> that is applied to all letters that don't have a specific font binding.

Basically this is a good scheme, because character pickers are typically used for
symbols. There are also two kinds: local and online. I sometimes pick from the
full-size PDF of the Code Charts. Theyʼre the best character picker IMO.

> Some font pickers are "stupid" enough to do this for simple accented code points that would have
> been in the currently selected font anyway.

Thatʼs really bad. I know that some people write documents by picking accented
letters in the special characters dialog. I can imagine that some other people
may use an online picker instead, partly because the word processor theyʼre using
may be a web app. Anyhow, this is very inefficient. The reason may be that people
often think either that a keyboard cannot be completed, or that completing a
keyboard would make it unusable, or hard to use, or full of stickers. This is one of
the main challenges of keyboard layout development.

> Your suggestions will just add to these problems.
> If editing in a rich text environment, work in rich text. And then lean on implementers to get
> export correct to other rich text formats....

I used to work nearly all the time in a rich text environment, and I added plenty
of autocorrections to speed up writing. Today, I work most of the time in plain
text. I donʼt use LaTeX, but I know that it is easily exported to many other
formats. PDF is a main target format. Most of the drawbacks start when the reader
wishes to copy-paste some lines of a (basically searchable) PDF either to rich text
or to plain text… but that is not the issue here.

I hope that my future recommendations will solve more problems than theyʼll create!

Marcel

[1] Karl Pentzlinʼs MODIFIER LETTER SMALL Q proposal:
http://www.unicode.org/L2/L2010/10230-modifier-q.pdf

[2] Alastair Houghtonʼs SUPERSCRIPT/SUBSCRIPT variant selectors suggestion:
http://www.unicode.org/mail-arch/unicode-ml/y2017-m01/0016.html

[3] Re: Why incomplete subscript/superscript alphabet ? a.lukyanov
http://www.unicode.org/mail-arch/unicode-ml/y2016-m10/0001.html
Re: Why incomplete subscript/superscript alphabet ? Leonardo Boiko
http://www.unicode.org/mail-arch/unicode-ml/y2016-m10/0013.html
Re: Why incomplete subscript/superscript alphabet ? Jukka K. Korpela
http://www.unicode.org/mail-arch/unicode-ml/y2016-m10/0014.html
Re: Why incomplete subscript/superscript alphabet ? Steve Swales
http://www.unicode.org/mail-arch/unicode-ml/y2016-m10/0015.html
Re: Why incomplete subscript/superscript alphabet ? Neil Harris
http://www.unicode.org/mail-arch/unicode-ml/y2016-m10/0017.html

[4] Re: Why incomplete subscript/superscript alphabet ? Denis Jacquerye
http://www.unicode.org/mail-arch/unicode-ml/y2016-m10/0037.html
Received on Fri Jan 06 2017 - 13:02:25 CST
