Re: Superscript asterisk

From: Ricardo Bermell-Benet (rbermell@aimplas.es)
Date: Wed Jun 30 1999 - 03:53:21 EDT


Torsten Mohrin has given me the perfect reply
to my question, a citation from the Unicode Standard:

"Superscripts and subscripts have been included in the Unicode
Standard only to provide compatibility with existing character
sets. In general, the Unicode character encoding does not attempt
to describe the positioning of a character above or below the
baseline in typographical layout."

It makes sense, although it hurts me and my little asterisk.
To give you some context, my interest in the "superscript
asterisk" is directed towards plain text mathematics.

Now, I would like to play devil's advocate, so please take apart
the following arguments.

First, let me rename the proposed symbol as REGULAR ASTERISK
   instead of SUPERSCRIPT ASTERISK, in accordance with my next
   point (and as a trick to reduce the psychological rejection ;)

Second, and most important (observe the broken parallelism with
   the Unicode Standard citation above): when I propose REGULAR
   ASTERISK I don't attempt to describe the positioning of the
   ASCII ASTERISK character, but to describe a new character, "new"
   since it has a different application and meaning.

   Remember the Unicode slogan: a glyph does not define a character.
   The REGULAR ASTERISK character is not necessarily the same as the
   ASCII ASTERISK character merely because they are similar in form.
   So applying the "superscript rejection rule" is perhaps not so
   straightforward.

   See, for example, how DIGIT TWO and SUPERSCRIPT TWO are a
   different case. The character DIGIT TWO is related to a well
   defined mathematical concept (a natural number), a precise
   meaning for a symbol, and so a good justification for a character.

   SUPERSCRIPT TWO relates to the same well defined mathematical
   concept, which does not vary when the glyph stands in a different
   position or has a different size. So it does not merit a new
   character (except for backwards compatibility).

   The same applies to the rest of the digits and to all the letters,
   including the Greek letters used in mathematics. I think this
   observation answers Kenneth Whistler's argument:
   "... Otherwise there would be no end to it: for example, any
   math italic variable name can be used as a superscript; likewise
   any Greek letter, and so on."

Third, I want to point out an inconsistency:

   ASTERISK OPERATOR was accepted as a new different character,
   in spite of its *identical* form to ASCII ASTERISK, because it
   denotes a different meaning.

   REGULAR ASTERISK, on the contrary, cannot be accepted as a new
   different character, although it denotes a different meaning,
   because of its *similar* form to ASCII ASTERISK.

   That is, REGULAR ASTERISK cannot be accepted as a new character,
   even though its difference in form from ASCII ASTERISK gives
   better support for treating it as a separate character than
   identity of form gives to ASTERISK OPERATOR.
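
The characters named in the second and third points can be looked up
in the Unicode Character Database. The following is only a sketch,
assuming Python's standard unicodedata module; the helper name
describe is made up for the illustration. It shows that SUPERSCRIPT
TWO carries a compatibility mapping back to DIGIT TWO, while ASTERISK
OPERATOR has no mapping to ASCII ASTERISK at all.

```python
import unicodedata

def describe(ch: str) -> tuple:
    """Return (code point, character name, decomposition mapping) for ch."""
    return (f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            unicodedata.decomposition(ch))

# DIGIT TWO, SUPERSCRIPT TWO, ASCII ASTERISK, ASTERISK OPERATOR
for ch in ["2", "\u00b2", "*", "\u2217"]:
    print(describe(ch))

# Compatibility normalization folds SUPERSCRIPT TWO into DIGIT TWO,
# but leaves ASTERISK OPERATOR distinct from ASCII ASTERISK.
folded_two = unicodedata.normalize("NFKC", "\u00b2")
folded_operator = unicodedata.normalize("NFKC", "\u2217")
print(folded_two == "2", folded_operator == "*")
```

An empty decomposition string means the database records no
relation between the two characters, which is the asymmetry the
third point complains about.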

Now see this trap:

  The weakness of the second point is that the less precise the
meaning of ASCII ASTERISK (since it is a kind of multi-purpose
gadget), the weaker its differentiation from REGULAR ASTERISK
(and the more sense it makes to apply the "superscript rejection
rule").

  The counter-argument: the greater the weakness of the second
point, the greater the ASTERISK OPERATOR inconsistency. (The less
precise the meaning of ASCII ASTERISK, the more precarious the
"ASTERISK OPERATOR versus ASCII ASTERISK" distinction.)

REGULAR ASTERISK definition
---------------------------

To be more precise, the following is the meaning I intend
for REGULAR ASTERISK.

1) As a mathematical symbol in regular expressions, denoting
   "the marked (left-adjacent) symbol or expression may be
   replicated an arbitrary number of times, including zero
   times"

   * For example, let caret (^) stand for REGULAR ASTERISK
     in the following regular expression:
        0^(1)
     this regular expression denotes the set of expressions
        {1, 01, 001, 0001, ... }
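
Sense (1) can be tried out directly. This is only a rough sketch,
assuming Python's standard re module, where the ordinary ASCII
asterisk plays the role intended for REGULAR ASTERISK; the name
in_language is hypothetical.

```python
import re

# Assumption: in Python's regex syntax the ASCII asterisk stands in for
# REGULAR ASTERISK, so the expression from the example is written 0*1.
ZEROS_THEN_ONE = re.compile(r"0*1")

def in_language(s: str) -> bool:
    """Return True if the whole string s is in the set {1, 01, 001, ...}."""
    return ZEROS_THEN_ONE.fullmatch(s) is not None

candidates = ["1", "01", "001", "0001", "0", "10", ""]
members = [s for s in candidates if in_language(s)]
print(members)
```

Strings such as "0", "10" and the empty string are rejected, because
"zero or more replications" applies only to the marked symbol 0 and
the final 1 is still required.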

The suggested form for REGULAR ASTERISK, neither defining nor
mandatory, is that of an asterisk in the position and size of a
superscript. A good model to imitate is the position and size of
the character PRIME.

For better separation of form and meaning, you can propose
other names that do not use the word "ASTERISK", e.g.
"REGULAR EXPRESSION MARK FOR ZERO OR MORE REPLICATIONS"
(perhaps too long, but you get what I mean).

Bibliography for the above-mentioned meaning of REGULAR ASTERISK:
   Hopcroft & Ullman [1979]
   "Introduction to Automata Theory, Languages, and Computation"
   Addison-Wesley
Open that book at any page at random; chances are you'll find
that charming little asterisk.

And more: the acid test for pseudo-superscripts.
-------------------------------------------------

Take a pair (base symbol, suspected superscript),
   * rename the suspect so that its name does not bias the test
     (e.g. don't call it superscript),
   * change the form of the suspect and
   * compare the relation of meanings between the two symbols,
     asking whether the original relation of meanings can
     still be supported

Example (a)
   Take LETTER A and (imaginary) SUPERSCRIPT LETTER A
   * name the second as CURIOUS THING
   * change the form of CURIOUS THING to, say, a little white
     square
   * can CURIOUS THING, in its form of a little white square,
     maintain its relation with LETTER A?
     No! The CURIOUS THING would << need a "connection in form" >>
     with its parent LETTER A to have a related personality.

     Note that if we also change the form of LETTER A to be
     a white square, then we can assign the same personality
     to both symbols (although in bizarre forms); they would
     be BIZARRE LETTER A and SUPERSCRIPT BIZARRE LETTER A,
     but could maintain the binary relation "to be, both, LETTER A".

Example (b)
   Take DIGIT TWO and SUPERSCRIPT TWO
   * name the second as PRETTY MATTER
   * change the form of PRETTY MATTER to, say, two little vertical
     bars
   * the two little bars can still express the mathematical concept
     "natural number two", but the connection of meaning with DIGIT
     TWO is lost (or at least it's very poorly expressed);
     These symbols << need a connection in form >>, since without
     that connection we feel that "there is an inconsistency in using
     different metaphors (forms) for the same concept"

Example (c)
   Take ASCII ASTERISK and REGULAR ASTERISK
   * name the second as REGULAR EXPRESSION MARK FOR ZERO OR MORE
     REPLICATIONS
   * change the form of "REGULAR EXPR..." to, say, a caret (^)
   * then note that, in the same context, both the conventional
     ASCII ASTERISK and the new "REGULAR EXPR..." can be used with
     their intended meanings << without needing a connection in
     form >>, because they have different, unrelated meanings.
     Since the original connection was only about forms, not about
     meanings, the separation of forms doesn't hurt.
     Please look above in this letter and you will see that
     I have done just that (used both (*) and (^)) in
     definition (1) of REGULAR ASTERISK.
      

Expanding the uses/sense of REGULAR ASTERISK.
---------------------------------------------

   Just as the PLUS character finds uses that do not express
   "numeric addition" but that keep the spirit of "addition"
   (e.g. string concatenation), we can find, for
   REGULAR ASTERISK, uses that keep the spirit of "REGULAR
   EXPRESSION MARK FOR ZERO OR MORE REPLICATIONS". The following
   is an example of such uses, the second sense I give for
   REGULAR ASTERISK:

2) As a PRIME-like qualifier, denoting that "the marked
   (left-adjacent) variable (variable in the mathematical sense) can
   be instantiated as some constant mathematical object (e.g. a list)
   which has a character of multiplicity, namely multiplicity
   with 'zero or more replications' (e.g. a 'possibly empty' list)".

   Note that the former sense (1) works with constants, while this
   second one works with variables.
   Note also that the "spirit" is maintained, but there could be
   some senses in which this "spirit" is relaxed (for example, the
   condition "zero or more replications" could be ignored).
   I think these are benign cases.

Coming back to my motivation:
-----------------------------

(Just one point for now.) Note that, in plain text mathematics, the
lack of superscript styling is easily worked around with expressions
like (N^2). But what about the asterisk? (N^*)? That workaround only
works with numbers and letters. Is there any substitute?

A lot of plain text mathematics can be done with Unicode symbols.
If my humble opinion as a computer engineer biased towards
mathematics counts for something, I think the Unicode Consortium
has done a really good job.

I think it would be foolish to pile up mathematical Unicode
characters ad infinitum. Of course, it would be great fun to have
a big repertoire of mathematical symbols; I don't deny that.
But, at least at this moment, I'm concerned with a *minimal*
working set for plain text mathematics. It is in pursuing that
goal that I found REGULAR ASTERISK to be a necessity (unless you
convince me to the contrary :-)

Well, that's all I wanted to say (for the moment...)
Nice to be on this list.

Ricardo Bermell-Benet <rbermell@aimplas.es>

   
 


