Re: Too narrowly defined: DIVISION SIGN & COLON

From: Julian Bradfield <jcb+unicode_at_inf.ed.ac.uk>
Date: Thu, 12 Jul 2012 14:54:30 +0100

On 2012-07-12, Hans Aberg <haberg-1_at_telia.com> wrote:
>> There are many characters that TeX users use that are not in
>> Unicode.
>
> All standard characters from TeX, LaTeX, and AMSTeX should be there,

What's a standard character? There's no such thing.
To take a random entry from the LaTeX Symbol Guide, where is the
\nrightspoon symbol from the MnSymbol package? (A negated multimap
symbol.)

Not to mention the symbols I've used from time to time, because

> them. In math, you can always invent your own characters and styles,

people do.

> in fact you could do that with any script, but it is not possible
> for Unicode to cover that. There is, though, a public use area, where
> one can add one's own characters.

You mean "private use". Crazy thing to do, because then you have to
worry about whether your PUA code point clashes with some other
author's PUA code point.

>> Because TeX is agnostic about such matters, one can set up any
>> convenient encoding for the input data (which is really the source
>> code of a program). For example, I have written documents in ASCII,
>> Latin-1, Big5, GB, UTF-8 and probably others. This is very convenient;
>> but it's only a convenience.
>
> UTF-8 only is simplest for the programmer that has to implement it.

Some of us are more concerned with users than programmers. Besides, all
the work for the "legacy" encodings has already been done. I wouldn't
ever want to go back to "ISO alphabet soup" for Latin etc., but for
CJK, the legacy codings are still sometimes convenient - for example,
if I write in Big5, I don't have to worry about telling my editor to
find a traditional Chinese font rather than a simplified or Japanese
font. It uses a Big5 font, and that's it.

> LuaTeX and the older XeTeX support UTF-8. They are available in TeX Live.
> http://www.tug.org/texlive/

They aren't TeX. Neither working mathematicians nor publishers nor
typesetters like dealing with constantly changing extensions and
variations on TeX - one of the biggest selling points of TeX is
stability. (Defeated somewhat by the instability of LaTeX and its
thousands of packages, but that's another story.)
If I needed to write complex - or even bidi - scripts routinely, I'd
probably be forced into one of them; but the typical mathematician
doesn't.

>>
>> One problem, of course, is that there is no MATHEMATICAL ROMAN set of
>> characters. This is one of the biggest botches in the whole
>> mathematical alphanumerical symbol botch.
>
> This was discussed here before; the LaTeX unicode-math package has options to control that (see its manual). For example, one gets a literal interpretation by:

Exactly. TeX can do what it likes. But you said it was an incompatibility
with Unicode that TeX sets plain ASCII math letters as italic,
implying that TeX should not be allowed to do what it likes.
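
To make that concrete, here is a minimal sketch (an illustration only,
assuming nothing beyond the standard LaTeX kernel): the same ASCII
letter comes out italic, upright or sans-serif purely according to the
markup around it, not according to which code point was typed.

```latex
\documentclass{article}
\begin{document}
% The input letter is plain ASCII "x" in all three cases;
% the markup, not the character code, selects the shape.
$x$           % default: math italic
$\mathrm{x}$  % upright roman, requested explicitly
$\mathsf{x}$  % sans-serif, again requested by markup
\end{document}
```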

>> If you encode semantic font
>> distinctions without requiring the use of higher-level markup, then
>> you need to encode also letters that are semantically distinctively
>> roman upright.
>
> It has already been encoded as mathematical style, see the "Mathematical Alphanumeric Symbols" here:
> http://www.unicode.org/charts/

*You* look. The plain upright style is unified with the BMP characters.

>> A more general problem is that which font styles are meaningful,
>> depends on the document. For example, I give lectures and talks, and I
>> set my slides in sans-serif. As I don't (usually) use distinctive
>> sans-serif symbols in my work, the maths is all in sans-serif
>> too: form, not content. But what then should I see if I type a Unicode
>> mathematical italic symbol in my slides? Serif, or sans-serif?

>
> It is up to you. The unicode-math package, mentioned above, has options to control that.

Of course it's up to me. I'm glad you agree. So why say that it's an
incompatibility with Unicode that TeX (by default) displays ASCII as
italic in maths? Are you changing your mind on that? I welcome that if
so, as that was what I found surprising.

(And, of course, it's much easier to use the established TeX
mechanisms for controlling these things, than to learn more options
for a package to allow me to use symbols that are hard to type and
even harder to distinguish clearly on screen.)

> It is traditional in pure math, and also in the physics books I have looked into, to always use serif. Possibly sans-serif belongs to another technical style. Unicode makes it possible to mix these styles on the character level, if you so will.

It's also traditional, for mostly good reasons to do with the limited
resolution of projectors, to use sans-serif in presentations. The only
reason that most people still have serifed maths is that they don't
know how to do otherwise (\usepackage{cmbright} is enough for most
people, if only they knew).
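
For anyone who wants to try it, a minimal sketch, assuming a LaTeX
installation in which the cmbright package and its fonts are available
(the sample sentence and formula are just placeholders):

```latex
\documentclass{article}
\usepackage{cmbright}  % switches both text and maths to the CM Bright sans-serif design
\begin{document}
Text and maths now share one sans-serif design: $a^2 + b^2 = c^2$.
\end{document}
```

Nothing in the source changes apart from the one \usepackage line; the
maths is still typed as ordinary ASCII letters.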

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
Received on Thu Jul 12 2012 - 08:56:28 CDT
