Re: PDUTR #25: Unicode Support for Mathematics

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Fri Dec 28 2001 - 21:58:07 EST


At 12:34 AM 12/28/01 -0600, starner@okstate.edu wrote:
>If you want to define text/math, and provide the disappearing parenthesis
>and precedence tables and everything, then that's fine, but I don't see
>why it should be part of Unicode, anymore than full music rendering is part
>of Unicode. It's a higher level protocol. IMO, section 5 should not be part
>of a Unicode draft report for that reason.

This opinion is shared by others and the best place for the information in
section 5 will certainly be discussed when the current *proposed* draft is
being reviewed for advancement to *draft* status. In the meantime, I'd like to
comment on your analogy.

The analogies to full music rendering are not as close as they appear at
first glance. If you look at a mathematical, scientific or technical paper
you will find a substantial amount of mathematical notation appearing as part
of ordinary text lines, including, at times, headings and titles, in other
words, strings that often are part of databases. (**)

The same is not true for music laid out on staff.

Secondly, a large subset of mathematical formulae can be expressed very
directly in a linear, text-like fashion, even though the remainder do
require fairly heavyweight markup to display correctly.

Again, this is not strictly analogous to musical notation.

The convention proposed in section 5 is clearly a lightweight markup
protocol. The disappearing parens in themselves are borderline in that
regard - in fact the mechanism is not far removed from some of the complex
script cases, where characters may or may not be invisible depending on
context. And giving operators some properties is not so removed from
FRACTION SLASH. However, to be workable, the proposed convention needs
subscript and superscript operators, and ultimately a convention of
applying certain "decorations" (limits, as well as combining accents) to
both individual characters and groups of characters. These are the aspects
that most clearly appear to cross the line into the realm of markup protocols.
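
To make the "disappearing parens" idea concrete, here is a minimal sketch in
Python. It is an illustration only, not the algorithm from section 5: the
function names are hypothetical, it handles a single top-level "/" and it
ignores the precedence tables entirely, so everything to the left of the
slash is treated as the numerator.

```python
def strip_outer_parens(operand):
    """Drop one enclosing pair of parentheses if it wraps the whole operand."""
    if len(operand) >= 2 and operand[0] == "(" and operand[-1] == ")":
        depth = 0
        for i, ch in enumerate(operand):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                # The opening paren closed before the end, e.g. "(a)+(b)":
                # the outer pair does not wrap the operand, so keep it as is.
                if depth == 0 and i != len(operand) - 1:
                    return operand
        return operand[1:-1]
    return operand


def split_fraction(expr):
    """Split at the first '/' outside parentheses.

    Returns (numerator, denominator) with grouping parentheses removed,
    or None if there is no top-level '/'.
    """
    depth = 0
    for i, ch in enumerate(expr):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "/" and depth == 0:
            return (strip_outer_parens(expr[:i]),
                    strip_outer_parens(expr[i + 1:]))
    return None


print(split_fraction("(a+b)/(c+d)"))
```

The parentheses in "(a+b)/(c+d)" only serve to delimit the operands of the
slash, so a renderer that stacks the result as a built-up fraction has no
reason to display them.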

On the other hand, the proposed 'markup' itself consists of the kinds of
things that one would use in a plain text fallback, e.g. when communicating
an equation by e-mail. One can conclude that the proposed convention is a
'renderable plain text fallback' that happens to cover a large subset of
commonly used mathematical notation. It is therefore a very different beast
from MathML, which is a full-fledged markup protocol, able to cover
practically everything and only barely human readable in source form.
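
The difference in weight is easy to see with a small Python sketch. The
helper below is hypothetical and covers only a simple fraction of
single-letter identifiers joined by "+" or "-"; the element names (math,
mfrac, mrow, mi, mo) are standard MathML presentation elements.

```python
import xml.etree.ElementTree as ET


def fraction_to_mathml(numerator_tokens, denominator_tokens):
    """Build MathML presentation markup for a simple fraction."""
    math = ET.Element("math", xmlns="http://www.w3.org/1998/Math/MathML")
    frac = ET.SubElement(math, "mfrac")
    for tokens in (numerator_tokens, denominator_tokens):
        row = ET.SubElement(frac, "mrow")
        for tok in tokens:
            # Operators go in <mo>, identifiers in <mi>.
            tag = "mo" if tok in ("+", "-") else "mi"
            ET.SubElement(row, tag).text = tok
    return ET.tostring(math, encoding="unicode")


linear = "(a+b)/c"
markup = fraction_to_mathml(["a", "+", "b"], ["c"])
print(linear)
print(markup)
```

The linear form is seven characters that a reader can take in at a glance;
the markup for the same fraction is many times longer and needs a renderer
to be legible.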

As such, it occupies a novel middle ground between plain Unicode (with
script rules) and full-fledged markup schemes.

A./

PS: Disclaimer: while I am a co-author of the TR, the credit for inventing
the scheme described in section 5 belongs entirely to Murray Sargent, who
will undoubtedly have his own things to say about it.

PPS: I did not elaborate on why the fact (marked with (**) above) that
limited amounts of mathematical notation end up in database strings is
significant. The reason is that such strings are ultimately plain text that
needs to be rendered in the absence of heavy duty markup protocols. A
convention that implements its own plain-text fallback has great advantages.


