Re: Too narrowly defined: DIVISION SIGN & COLON from Hans Aberg on 2012-07-12 (Unicode Mail List Archive)

From: Hans Aberg <haberg-1_at_telia.com>
Date: Thu, 12 Jul 2012 18:27:23 +0200

On 12 Jul 2012, at 15:54, Julian Bradfield wrote:

> On 2012-07-12, Hans Aberg <haberg-1_at_telia.com> wrote:
>>> There are many characters that TeX users use that are not in
>>> Unicode.
>>
>> All standard characters from TeX, LaTeX, and AMSTeX should be there,
>
> What's a standard character? There's no such thing.
> To take a random entry from the LaTeX Symbol Guide, where is the
> \nrightspoon symbol from the MnSymbol package? (A negated multimap
> symbol.)
>
> Not to mention the symbols I've used from time to time, because

You tell me, because I posted a request for missing characters in different forums. Perhaps you invented it after the standardization was made?

>> them. In math, you can always invent your own characters and styles,
>
> people do.

You and others who know about those characters must submit proposals if you want to see them become part of Unicode.

>> in fact you could do that with any script, but it is not possible
>> for Unicode to cover that. There is, though, a public use area, where
>> one can add one's own characters.
>
> You mean "private use". Crazy thing to do, because then you have to
> worry about whether your PUA code point clashes with some other
> author's PUA code point.

There is some system for avoiding that. Perhaps someone else here can fill in the details.
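
For reference, the private use ranges themselves are fixed by the standard: one in the BMP (U+E000 to U+F8FF) and planes 15 and 16. Here is a minimal Python sketch of how a program might recognise such code points, using the general category "Co" from the standard library's unicodedata module. The helper name is_private_use is my own, chosen for illustration. It only detects private-use code points; it does nothing about two authors assigning different meanings to the same one.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """Return True if ch is a private-use code point (general category Co)."""
    return unicodedata.category(ch) == "Co"


# The three private-use ranges defined by the standard:
# one in the BMP, plus most of planes 15 and 16.
PUA_RANGES = [(0xE000, 0xF8FF), (0xF0000, 0xFFFFD), (0x100000, 0x10FFFD)]

for first, last in PUA_RANGES:
    # Both endpoints of each range should be reported as private use.
    print(f"U+{first:04X}..U+{last:04X}:",
          is_private_use(chr(first)), is_private_use(chr(last)))
```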

>>> Because TeX is agnostic about such matters, one can set up any
>>> convenient encoding for the input data (which is really the source
>>> code of a program). For example, I have written documents in ASCII,
>>> Latin-1, Big5, GB, UTF-8 and probably others. This is very convenient;
>>> but it's only a convenience.
>>
>> Using UTF-8 only is simplest for the programmer who has to implement it.
>
> Some of us are more concerned with users than programmers.

Well, if the programmers don't implement it, you are left out in the cold.

> Besides, all
> the work for the "legacy" encodings has already been done. I wouldn't
> ever want to go back to "ISO alphabet soup" for Latin etc., but for
> CJK, the legacy codings are still sometimes convenient - for example,
> if I write in Big5, I don't have to worry about telling my editor to
> find a traditional Chinese font rather than a simplified or Japanese
> font. It uses a Big5 font, and that's it.

Before UTF-8, in the 1990s, some Russians used multi-encoded text files with TeX/LaTeX, but I doubt they do that anymore. Use whatever you like.
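
To make the encoding point concrete: the same text is a different byte sequence in each of these encodings, so whoever reads the file has to know which one was used. A small Python sketch, assuming the standard library codecs named big5, gb2312 and utf-8; the sample string is my own and only for illustration.

```python
text = "中文"  # sample string, chosen for illustration only

# Encode the same two characters in two legacy encodings and in UTF-8.
encoded = {name: text.encode(name) for name in ("big5", "gb2312", "utf-8")}

for name, data in encoded.items():
    # Show the raw bytes and check that each encoding round-trips.
    print(f"{name:7s} {data.hex()} ({len(data)} bytes)")
    assert data.decode(name) == text
```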

>> LuaTeX and the older XeTeX support UTF-8. They are available in TeX Live.
>> http://www.tug.org/texlive/
>
> They aren't TeX.

Clearly not, since TeX is not developed anymore.

> Neither working mathematicians nor publishers nor
> typesetters like dealing with constantly changing extensions and
> variations on TeX - one of the biggest selling points of TeX is
> stability. (Defeated somewhat by the instability of LaTeX and its
> thousands of packages, but that's another story.)
> If I need to write complex - or even bidi - scripts routinely, I'd
> probably be forced into one of them; but the typical mathematician
> doesn't.

I do not see your point here.

>>> One problem, of course, is that there is no MATHEMATICAL ROMAN set of
>>> characters. This is one of the biggest botches in the whole
>>> mathematical alphanumerical symbol botch.
>>
>> This was discussed here before; the LaTeX unicode-math package has options to control that (see its manual). For example, one gets a literal interpretation by:
>
> Exactly. TeX can do what it likes.

No. TeX cannot handle UTF-8, and as I recall LaTeX's ability to emulate that was limited.

> But you said it was an incompatibility
> with Unicode that TeX sets plain ASCII math letters as italic,
> implying that TeX should not be allowed to do what it likes.

In LuaTeX or XeTeX, it is obviously relative to the original TeX definitions, the ones most people are used to.

>>> If you encode semantic font
>>> distinctions without requiring the use of higher-level markup, then
>>> you need to encode also letters that are semantically distinctively
>>> roman upright.
>>
>> It has already been encoded as mathematical style, see the "Mathematical Alphanumeric Symbols" here:
>> http://www.unicode.org/charts/
>
> *You* look. The plain upright style is unified with the BMP characters.

Yes, that is why the Unicode paradigm departs from the TeX one.
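
The unification can be seen directly in the character data. A minimal Python sketch using the standard library's unicodedata module; the helper name math_style_info is hypothetical. It shows that the mathematical italic "a" is a separate code point that compatibility-normalizes to the plain BMP letter, and that there is no separately named upright counterpart to look up.

```python
import unicodedata


def math_style_info(ch: str) -> tuple[str, str]:
    """Return the character's name and its NFKC-normalized form."""
    return unicodedata.name(ch), unicodedata.normalize("NFKC", ch)


italic_a = "\U0001D44E"  # from the Mathematical Alphanumeric Symbols block
print(math_style_info(italic_a))
print(math_style_info("a"))  # the BMP letter, which doubles as upright

# There is no separately encoded upright (roman) mathematical letter.
try:
    unicodedata.lookup("MATHEMATICAL ROMAN SMALL A")
    has_roman = True
except KeyError:
    has_roman = False
print("separate upright letter encoded:", has_roman)
```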

>>> A more general problem is that which font styles are meaningful,
>>> depends on the document. For example, I give lectures and talks, and I
>>> set my slides in sans-serif. As I don't (usually) use distinctive
>>> sans-serif symbols in my work, the maths is all in sans-serif
>>> too: form, not content. But what then should I see if I type a Unicode
>>> mathematical italic symbol in my slides? Serif, or sans-serif?
>
>>
>> It is up to you. The unicode-math package, mentioned above, has options to control that.
>
> Of course it's up to me. I'm glad you agree. So why say that it's an
> incompatibility with Unicode that TeX (by default) displays ASCII as
> italic in maths? Are you changing your mind on that? I welcome that if
> so, as that was what I found surprising.

As you yourself have noted, consistent Unicode use requires the BMP characters to be upright, which is incompatible with TeX, which sets them in italic.

> (And, of course, it's much easier to use the established TeX
> mechanisms for controlling these things, than to learn more options
> for a package to allow me to use symbols that are hard to type and
> even harder to distinguish clearly on screen.)

That is because there are currently no convenient input methods, as also mentioned earlier in this thread.

>> It is traditional in pure math, and also in the physics books I have looked into, to always use serif. Possibly sans-serif belongs to another technical style. Unicode makes it possible to mix these styles on the character level, if you so wish.
>
> It's also traditional, for mostly good reasons to do with the limited
> resolution of projectors, to use sans-serif in presentations. The only
> reason that most people still have serifed maths is that they don't
> know how to do otherwise (\usepackage{cmbright} is enough for most
> people, if only they knew).

Yes, low resolution is a motivation for using sans-serif, but that may change given the new high-resolution displays arriving about now.

Hans
Received on Thu Jul 12 2012 - 11:32:33 CDT
