Re: Too narrowly defined: DIVISION SIGN & COLON from Hans Aberg on 2012-07-12 (Unicode Mail List Archive)

From: Hans Aberg <haberg-1_at_telia.com>
Date: Thu, 12 Jul 2012 13:28:54 +0200

On 12 Jul 2012, at 10:44, Julian Bradfield wrote:

> [ Please don't copy me on replies; the place for this is the mailing
> list, not my inbox, unless you want to go off-list. ]

Check whether you can set this in the mailing list preferences. On some lists it is very important to cc, as those who post to the list may not be subscribed to it, though that is not the case here.

> On 2012-07-11, Hans Aberg <haberg-1_at_telia.com> wrote:
>
>> Unicode has added all the characters from TeX plus some, making it
>> possible to use characters in the input file where TeX is forced to
>> use ASCII. This though changes the paradigm, and it is a question of
>> which paradigm one wants to adhere to.
>
> This doesn't seem to make much sense, or have much truth, to me.
...
> There are many characters that TeX users use that are not in
> Unicode.

All standard characters from TeX, LaTeX, and AMSTeX should be there, and there are now the STIX Fonts <http://stixfonts.org/> implementing them. In math, you can always invent your own characters and styles (in fact you could do that with any script), but it is not possible for Unicode to cover that. There is, though, a Private Use Area, where one can add one's own characters.

> Because TeX is agnostic about such matters, one can set up any
> convenient encoding for the input data (which is really the source
> code of a program). For example, I have written documents in ASCII,
> Latin-1, Big5, GB, UTF-8 and probably others. This is very convenient;
> but it's only a convenience.

Supporting only UTF-8 is simplest for the programmer who has to implement it.

LuaTeX and the older XeTeX support UTF-8. They are available in TeX Live.
http://www.tug.org/texlive/

> If one uses UTF-8, then one has the problem of how to deal with the
> case where Unicode trespasses on TeX's territory, by specifying font
> styles.
> This is not hard: for example, the obvious thing to do is to
> arrange for the Unicode MATHEMATICAL SMALL ITALIC M to be an
> abbreviation for \mathit{m}, and so on.
> Note, incidentally, that this is not the same as the meaning of a
> plain ASCII (or EBCDIC) "m" in TeX. In TeX math mode, the meaning of
> "m" is dependent on the currently selected math font family: just as
> in plain text, the font of "m" depends on the currently selected
> text font.
>
> One problem, of course, is that there is no MATHEMATICAL ROMAN set of
> characters. This is one of the biggest botches in the whole
> mathematical alphanumerical symbol botch.

This was discussed here before; the LaTeX unicode-math package has options to control that (see its manual). For example, one gets a literal interpretation by:
  \usepackage[math-style=literal,colon=literal]{unicode-math}
  \defaultfontfeatures{Ligatures=TeX}
  \setmainfont{XITS}
  \setmathfont{XITS Math}

Here, the XITS fonts are used.
http://www.khaledhosny.org/node/143
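
As a minimal sketch, a complete document using the settings above might look like the following. It assumes that the XITS fonts are installed under the names "XITS" and "XITS Math" and that the file is saved as UTF-8 and compiled with XeLaTeX or LuaLaTeX; the formula itself is only an illustration. With math-style=literal, the directly typed U+1D45A MATHEMATICAL ITALIC SMALL M should come out italic, while the plain ASCII "m" should come out upright, as typed:

  \documentclass{article}
  % Preamble lines as given above
  \usepackage[math-style=literal,colon=literal]{unicode-math}
  \defaultfontfeatures{Ligatures=TeX}
  \setmainfont{XITS}
  \setmathfont{XITS Math}
  \begin{document}
  % First letter: U+1D45A typed directly; second letter: ASCII m
  $𝑚 + m$
  \end{document}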

> If you encode semantic font
> distinctions without requiring the use of higher-level markup, then
> you need to encode also letters that are semantically distinctively
> roman upright.

It has already been encoded as mathematical style, see the "Mathematical Alphanumeric Symbols" here:
http://www.unicode.org/charts/

It is available in the STIX and XITS fonts, plus some others, as mentioned in the README of the aforementioned unicode-math package.

> A more general problem is that which font styles are meaningful,
> depends on the document. For example, I give lectures and talks, and I
> set my slides in sans-serif. As I don't (usually) use distinctive
> sans-serif symbols in my work, the maths is all in sans-serif
> too: form, not content. But what then should I see if I type a Unicode
> mathematical italic symbol in my slides? Serif, or sans-serif?

It is up to you. The unicode-math package, mentioned above, has options to control that.

It is traditional in pure math, and also in the physics books I have looked into, to always use serif. Possibly sans-serif belongs to another technical style. Unicode makes it possible to mix these styles on the character level, if you so wish.
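
As a rough sketch of such character-level mixing, the document below puts a serif italic and a sans-serif italic "m" in the same formula by typing the Unicode characters directly. It assumes the same unicode-math preamble as earlier in this message, a UTF-8 source compiled with XeLaTeX or LuaLaTeX, and a math font that supplies both glyphs; the formula is only an illustration:

  \documentclass{article}
  \usepackage[math-style=literal]{unicode-math}
  \setmathfont{XITS Math}
  \begin{document}
  % U+1D45A MATHEMATICAL ITALIC SMALL M, then
  % U+1D62E MATHEMATICAL SANS-SERIF ITALIC SMALL M
  $𝑚 \ne 𝘮$
  \end{document}

In recent versions of unicode-math, the second letter can also be written in ASCII as \symsfit{m}; check the manual of the installed version for the command names it provides.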

Hans
Received on Thu Jul 12 2012 - 06:30:14 CDT
