L2/01-064

From: Patrick D. F. Ion [mailto:ion@ams.org]
Sent: Wednesday, January 24, 2001 3:57 PM
Subject: Mathematical Variant Symbols

Dear Unicoders,

MathML has what could be a serious problem, and Unicode has the means to obviate it in a way that I hope is in full accord with the principles that have already been applied in doing so much for the representation of mathematical and scientific works.

The problem is a handful of math symbols that were not accepted in the additional math set, because they were held to be variants of others already accepted. That is a perfectly commonplace determination the UTC has to make all the time. The symbols in question are some LONG forms of arrows, which are certainly commonly used in mathematical papers (the shorter forms would tend to be used in connection with limits and the longer with mappings, for instance).

The problem arises because their use is so common that they were assigned entities in the ISOAMSA entity set, one of the several mathematical symbol entity sets ISO defined. MathML is an XML application, and MathML is intended to support the full legacy collection of the ISO math entity sets. This is at least in part because publishers who use forms of the ISO 12083 standard for mathematics would reasonably expect it to. The STIX project of the STIPUB group of publishers naturally included those ISO sets in its collection of mathematical characters, from which was drawn the big list submitted to the UTC for consideration of possible additions.

Now, however, with the dust almost completely settled over UCS characters for mathematics, in a way that will be a boon to the community, the absence of the long arrows poses a problem for MathML.

There are several conceivable solutions, in increasing order of desirability:

0. Ignore the problem [unprincipled, and it would just bite us later]

1. Make assignments in the PUA for these symbols [Back to the PUA at this stage seems absurd]

2. Make special markup arrangements in MathML for them [a can of worms]

3. Use existing markup techniques to ensure explicit sizes for these arrows [possible, but then it raises questions about the explicit sizing to be required. In addition, and more importantly, it means that sharing character entity sets with other XML applications, like XHTML, DocBook etc., is not readily possible, because an XML parser is in a position to resolve entities only to plain text, not to markup. That leads to difficulties in getting MathML adopted, which we would like to finesse.]

4. Use the VS1 modifier to declare the longer forms as variants of the corresponding shorter ones. [This would solve the problem, since it means any possible character entity definitions would resolve to plain text. The implementers of MathML have to parse for VS1 and handle variant characters already. It means no silly hacks to use these fairly common arrow forms as math operators, which would result from the markup solution. Generally it seems to have few drawbacks.]

Therefore the W3C Math WG would like to urge the addition of the following items to the VS1 variant table:

[LONG LEFT DOUBLE ARROW]
[LONG RIGHT DOUBLE ARROW]
[LONG LEFT AND RIGHT DOUBLE ARR]
[LONG LEFT ARROW]
[LONG RIGHT ARROW]
[LONG LEFT AND RIGHT ARROW]
[LONG MAPS TO, LEFTWARD]
[LONG MAPS TO, RIGHTWARD]

Best regards,
Patrick

-- end