**From:** Asmus Freytag (*asmusf@ix.netcom.com*)

**Date:** Thu Apr 14 2011 - 10:53:34 CDT

**In reply to:** Doug Ewell: "Re: math alphabets, WAS: Proprietary Card Decks"

On 4/14/2011 7:13 AM, Doug Ewell wrote:

> Hans Aberg <haberg dash 1 at telia dot com> wrote:
>
>>>> Unicode does not have characters for say superscripts and
>>>> subscripts, which are essential to math. My guess it would be too
>>>> complicated to require it for current text-only renderers, but in
>>>> the future that might change.
>>>>
>>> No, because in math, superscript is not a character attribute but a
>>> formatting style that is applied to any term or formula and that can
>>> be fully (and infinitely) nested.
>>>
>>> That abstraction is better handled in markup than in plain text.
>>> (Unlike the mathalphanumerics, such markup is still independent of
>>> the font).
>> That is so in rendering programs like TeX, because one does not enter
>> the math so that it can be parsed semantically. One enters
>> superscripts how they should be rendered and not by the logical
>> structure of the formula.
>>
>> That is different if one has say a theorem prover. Then one can enter
>> a formula, let the program parse it into an AST, and from that infer
>> how it should be rendered, for example, where to put parenthesizes.
> I don't follow this. Asmus' point is that superscript can be applied,
> not only to any arbitrary character that can be used in a math
> expression, but also at any arbitrary level of nesting. After Unicode
> has finished adding superscript versions of every imaginable math
> character, including all of the math alphanumerics, it would then have
> to add second-level, third-level, etc. versions of every character, so
> that one could enter "a to the b to the c to the (d times square root of
> 2)" in plain text. And don't forget subscripts of superscripts, and
> vice versa.
>
> A counterargument that this is going too far, that Unicode wouldn't need
> to encode arbitrary levels of superscript/subscript nesting but only
> one, is basically an agreement with Asmus that this problem is best
> solved by (semantic) markup.

The distinction between "semantic" and "presentational" markup is an

important one here. This is a distinction that is figuring prominently

in the design discussions for HTML5, for example, but it has not been

dealt with very explicitly, up to now, in discussions of character encoding.

In character encoding, all markup is implicitly supposed to be

presentational, with the semantics represented in the plain text layer.

If that simple model were appropriate in all circumstances, then any

time you needed any markup at all, you would have "rich text"; and if

rich text is already required, why would anyone want to encode a

distinction in plain text?

However, this assumes that all markup is presentational. In the example

for mathematical notation we see that Unicode encodes characters for

those distinctions that would require presentational markup (appearance

of symbols), while not encoding characters for distinctions that require

semantic markup (scoping of expressions, nesting of expressions,

including super/subscript). Another way to look at this is that in

mathematical notation the "atoms" include elements that would be

styled (presentational) in a regular text context.

In phonetic notations (except some of the odd cases recently introduced)

super and subscript are atomic in this sense and not presentational.

However, where super and subscripts become expressions (with parens or

slashes), then the question needs to be asked (and is being asked)

whether these aspects of phonetic notations shouldn't best be represented

with semantic markup.

We are familiar with user interfaces that present "bold", "italic" etc.

as attributes of characters, when typographically, these are really

separate fonts (albeit conceived in concert with the regular font).

Viewed in that way, the distinction between bold and italic forms and

black letter, openface, sans-serif and monospace forms is simply a

matter of degree and convention. All of these variations require font

selection. Font selection is the ultimate in presentational markup.

You could say that encoding the mathematical alphanumerics means that

you can create mathematical text where one doesn't need font-selection

to carry the semantics of a document, while you still need semantic

markup. In particular, one doesn't need font-selection at the level of

individual "atoms" of the notation.
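
A minimal sketch of that point, assuming Python's standard `unicodedata` module (the sample characters and the helper name `describe` are chosen here for illustration and do not come from the thread): each mathematical alphanumeric is a distinct code point with its own name, so the bold, italic, or Fraktur distinction is carried in plain text, while compatibility normalization (NFKC) folds it back to the ordinary letter.

```python
import unicodedata

# Sample atoms chosen for illustration: "styled" letters that Unicode
# encodes as characters in their own right (bold A, Fraktur A, italic a).
atoms = ["\U0001D400", "\U0001D504", "\U0001D44E"]

def describe(ch: str) -> tuple[str, str, str]:
    """Return (character name, compatibility decomposition, NFKC-folded form)."""
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),
        unicodedata.normalize("NFKC", ch),
    )

for ch in atoms:
    name, decomp, folded = describe(ch)
    print(f"U+{ord(ch):04X}  {name}  [{decomp}]  NFKC -> {folded}")
```

The `<font>` tag in the decomposition is how the character database records that the difference from the base letter is one of font style. Applying NFKC to mathematical text therefore discards exactly the distinction these characters exist to preserve.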

Super and subscript are a combination of relative positioning and size,

and as Doug and I are pointing out here, this positioning applies in

principle to the whole expression (whether it's a single variable or a

fraction or a more complicated expression). Positioning in mathematical

notation already requires markup for scoping, hence singling out

super/subscript isn't adding anything useful.
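
To make the scoping argument concrete, here is a small sketch in Python. The node classes (`Atom`, `Seq`, `Sqrt`, `Sup`) and the functions `to_tex` and `sup_depth` are hypothetical names invented for this illustration. It models Doug's example, "a to the b to the c to the (d times square root of 2)", as a tree in which a superscript attaches to a whole sub-expression, and prints it in TeX-like notation, assuming TeX's `^{...}` grouping as the markup.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Atom:
    text: str            # a single symbol, carried as plain text

@dataclass(frozen=True)
class Seq:
    items: tuple         # juxtaposed sub-expressions, e.g. "d sqrt(2)"

@dataclass(frozen=True)
class Sqrt:
    radicand: "Expr"

@dataclass(frozen=True)
class Sup:
    base: "Expr"
    exponent: "Expr"     # any expression, not just one character

Expr = Union[Atom, Seq, Sqrt, Sup]

def to_tex(e: Expr) -> str:
    """Serialize the tree; braces mark the scope of each superscript."""
    if isinstance(e, Atom):
        return e.text
    if isinstance(e, Seq):
        return " ".join(to_tex(item) for item in e.items)
    if isinstance(e, Sqrt):
        return f"\\sqrt{{{to_tex(e.radicand)}}}"
    if isinstance(e, Sup):
        return f"{to_tex(e.base)}^{{{to_tex(e.exponent)}}}"
    raise TypeError(f"unknown node: {e!r}")

def sup_depth(e: Expr) -> int:
    """Deepest level of superscript nesting found in the tree."""
    if isinstance(e, Atom):
        return 0
    if isinstance(e, Seq):
        return max((sup_depth(item) for item in e.items), default=0)
    if isinstance(e, Sqrt):
        return sup_depth(e.radicand)
    return max(sup_depth(e.base), 1 + sup_depth(e.exponent))

expr = Sup(Atom("a"),
           Sup(Atom("b"),
               Sup(Atom("c"), Seq((Atom("d"), Sqrt(Atom("2")))))))

print(to_tex(expr))     # a^{b^{c^{d \sqrt{2}}}}
print(sup_depth(expr))  # 3
```

Nothing in the tree limits the depth, and the exponent can be any expression. A per-character "superscript" attribute has no way to express where the scope of each level begins and ends.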

In conclusion, the lesson learned is that the simplistic character-glyph

model, which recognizes only semantic plain text and presentational markup,

needs to be extended to include a hybrid model where atomic semantics

are present in the plain text layer, scoped semantics are present in

semantic markup and presentational markup isn't required to carry any of

the semantic information. This model is characterized by the property

that it does not require markup (such as entity markup) for the

representation of atomic entities, and that presentational markup can be

applied in ways that are clearly separate from the semantics (e.g.

choice of a particular Fraktur/Black letter font to render generic

"black letter" symbols).

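The three layers of this hybrid model can be sketched roughly as follows. This is an illustration under stated assumptions, not a prescribed implementation: MathML element names (`math`, `msup`, `mi`, `mn`) stand in for the semantic markup, the function names are hypothetical, and the font name `ExampleFraktur` is a placeholder rather than a real font.

```python
import xml.etree.ElementTree as ET

# Layer 1: atomic semantics in plain text. The Fraktur letter is a
# character, not an entity reference or a font switch.
FRAKTUR_G = "\U0001D524"  # MATHEMATICAL FRAKTUR SMALL G

def build_power(base_atom: str, exponent: str) -> ET.Element:
    """Layer 2: scoped semantics in markup (base and exponent of a power)."""
    math = ET.Element("math")
    msup = ET.SubElement(math, "msup")
    ET.SubElement(msup, "mi").text = base_atom
    ET.SubElement(msup, "mn").text = exponent
    return math

def style_rule(font_family: str) -> str:
    """Layer 3: presentation, kept apart from the document content."""
    return f"math {{ font-family: {font_family}; }}"

semantic = ET.tostring(build_power(FRAKTUR_G, "2"), encoding="unicode")
presentation = style_rule("ExampleFraktur")  # hypothetical font name

print(semantic)
print(presentation)
```

Swapping the font in `style_rule` changes only how the generic black-letter symbol is drawn. The serialized expression, atoms and scoping included, stays the same.
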
A./
