Re: Too narrowly defined: DIVISION SIGN & COLON

From: Ken Whistler <kenw_at_sybase.com>
Date: Tue, 10 Jul 2012 17:05:16 -0700

On 7/10/2012 4:22 PM, Mark Davis ☕ wrote:
> I would disagree about the preference for ratio; I think it is a
> historical accident in Unicode.
>

Not really.

The following pairs dating from Unicode 1.0 were deliberate:

U+002D HYPHEN-MINUS
U+2212 MINUS SIGN

U+002F SOLIDUS (Unicode 1.0 called it "SLASH")
U+2215 DIVISION SLASH

U+005C REVERSE SOLIDUS (Unicode 1.0 called it "BACKSLASH")
U+2216 SET MINUS

U+002A ASTERISK
U+2217 ASTERISK OPERATOR

U+25E6 WHITE BULLET
U+2218 RING OPERATOR

U+2022 BULLET
U+2219 BULLET OPERATOR

U+007C VERTICAL LINE (Unicode 1.0 called it "VERTICAL BAR")
U+2223 DIVIDES

U+2016 DOUBLE VERTICAL LINE (Unicode 1.0 called it "DOUBLE VERTICAL BAR")
U+2225 PARALLEL TO

U+003A COLON
U+2236 RATIO

U+007E TILDE
U+223C TILDE OPERATOR

U+00B7 MIDDLE DOT
U+22C5 DOT OPERATOR

If anything, the "accident" is that the use of "!" for factorial was not
distinguished with a separate symbol character. I don't recall the
argument in detail -- it was discussed. But I suspect that it came down
to most of the math operators being in principle distinguishable because
they are rendered on the math centerline, rather than the baseline,
whereas nobody could think of a good reason for a layout distinction
for the factorial -- so it fell instead into the bucket already occupied
by "." as full stop versus decimal point (versus record separator versus...)

Subsequent history has led to more systematic distinctions,
both in use and in glyph design, for some of the pairs listed above.
For example, the two tildes generally look different. The SET MINUS was
discovered to actually be distinct from a backslash, with a different
angle and length. And so on. So that has whittled down the list of
characters that people, after the fact, come to think of as accidental
duplicates.

But trying to rationalize these decisions by examining only the latest
charts, while ignoring the history of how these distinctions came about
in the first place, is not a productive direction, IMO.

Incidentally, the symbols in the U+2200 Mathematical Operators block
got somewhat different treatment than generic punctuation, other
symbols, or combining marks in the unification versus non-unification
decisions for the original draft charts in 1989 and 1990. One reason
was the intuition back then that having unambiguous encodings for the
math operators would be important for machine processing of
mathematical data (as in algebra systems). It isn't so clear now, in
retrospect, whether
some of the disunifications were a good idea or not. But those
decisions are what we have inherited in the standard now, for better
or worse.

--Ken
Received on Tue Jul 10 2012 - 19:07:50 CDT
