Re: French Superscript Abbreviations Fit Plain Text Requirements

From: Asmus Freytag
Date: Wed, 28 Dec 2016 13:47:00 -0800

On 12/28/2016 7:25 AM, Marcel Schneider wrote:
Applied to the French abbreviation of “numéros” (numbers), that means that the 
abbreviationʼs final letters 'os' *must not* be formatted as superscript: Since 
“the extra information in rich text can always be stripped away to reveal the 
‘pure’ text underneath” (TUS, ibid.), 'n^{os}' would end up as 'nos' (“our”, 
with a plural noun). Consequently, best practice is to represent it using the 
Unicode superscript “modifier letters”: 'nᵒˢ'.

This is seriously overstating the plain text principle.

There are many places where formatting affects the reading (and not just the presentation) of the text. In some cases, it is appropriate to encode characters for that; in others, the conclusion is simply that plain text is not sufficient.

In English, superscript is used for ordinal numbers. The fallback without superscript tends to be functional, because of the alternation between digits and letters, but there's nothing "pure" about it.

Some sentences in English can be parsed ambiguously; the convention in print has been to italicize the word intended to take the stress. Here, the plain-text fallback is less functional, as it re-introduces the ambiguity.

There is no rule that says that *all* content information *must* be expressible on the plain text level. Some edge cases exist, where other layers, by necessity, participate.

Mathematical notation is a good example of such a mixed case: while ordinary variables can be expressed in plain text with the help of mathematical alphabets, the proper display of formulas requires markup. Even Murray Sargent's plain text math is markup, albeit a very clever one that re-uses conventions used for the inline presentation of mathematical expressions. (Where that is insufficient, it introduces additional conventions, clearly extraneous to the content, and hence markup.)

The encoding conventions (principles) chosen by Unicode stipulate that for ordinary text (not notations) any part of the content that requires alternate presentation (italics, superscript, etc.) is to be supplied via styles, not coded characters. In contrast, for scholarly or technical notation, that requirement is relaxed.

As long as French is ordinary text, the abbreviations require styled (rich) text.
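
To make the 'nos' example from the quoted message concrete, here is a minimal Python sketch of the two representations. The ^{...} superscript syntax and the helper name strip_superscript_markup are assumptions made purely for illustration; they stand in for whatever styling layer a rich text format would actually use. The last line relies on the fact that U+1D52 and U+02E2 carry <super> compatibility decompositions, so a compatibility normalization such as NFKC also folds them to plain letters.

```python
import re
import unicodedata


def strip_superscript_markup(text: str) -> str:
    """Drop a hypothetical ^{...} superscript wrapper, keeping its contents."""
    return re.sub(r"\^\{([^}]*)\}", r"\1", text)


styled = "n^{os}"          # superscript carried by (hypothetical) markup
encoded = "n\u1d52\u02e2"  # superscript carried by modifier letters U+1D52, U+02E2

# Stripping the styling layer leaves the bare letters.
print(strip_superscript_markup(styled))

# The modifier letters are ordinary coded characters, so they pass through.
print(strip_superscript_markup(encoded))

# Compatibility normalization folds the modifier letters to plain letters.
print(unicodedata.normalize("NFKC", encoded))
```

Under these assumptions, the first and third results are both 'nos', while the second keeps the modifier letters; whether that survival is desirable for ordinary text is exactly what is in dispute above.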


Received on Wed Dec 28 2016 - 15:48:31 CST