RE: Difference between EM QUAD and EM SPACE

From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Jul 10 2000 - 20:06:19 EDT


Jonathan wrote:

> In TeX, the difference is that an EM QUAD (\qquad) and an EN QUAD
> (\quad) provide spaces that are legitimate breakpoints for lines within a
> paragraph; while EM SPACE, EN SPACE (\enspace) and THIN SPACE (\thinspace)
> produce horizontal space that cannot cause a line-break.

This is interesting, but it strikes me as a TeX-specific convention rather
than general typographic practice: a way to get breaking or non-breaking
spaces of specified widths.

>
> My assumption on reading the Unicode standard was that this was the
> intention---though it is not spelled out anywhere.

It was not.

> Maybe a clarification
> would be worthwhile. If so, the fact that TeX is so widely implemented and
> used---at least within the arena of technical documents---might make it
> worthwhile to preserve those characteristics.
>

Well, the problem is that you now also have to contend with UTR #14, Line
Breaking Properties, whose accompanying data file, LineBreak.txt, assigns the
following line break properties to the space characters:

0020;SP;SPACE

00A0;GL;NO-BREAK SPACE
 
2000;BA;EN QUAD
2001;BA;EM QUAD
2002;BA;EN SPACE
2003;BA;EM SPACE
2004;BA;THREE-PER-EM SPACE
2005;BA;FOUR-PER-EM SPACE
2006;BA;SIX-PER-EM SPACE
2007;GL;FIGURE SPACE
2008;BA;PUNCTUATION SPACE
2009;BA;THIN SPACE
200A;BA;HAIR SPACE
200B;ZW;ZERO WIDTH SPACE

where:

SP = line break opportunity after the character; enables indirect breaks
BA = line break opportunity after the character
GL = prohibit line breaks before or after
ZW = optional break
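
As a rough illustration of how an application might consume these
properties, here is a minimal Python sketch. The table is copied from the
LineBreak.txt excerpt above; the names LINE_BREAK_CLASS and break_offsets are
hypothetical, and the logic applies only the four class descriptions listed
here, not the full pair rules of UTR #14.

```python
# Line break classes for the space characters, copied from the
# LineBreak.txt excerpt quoted in this message.
LINE_BREAK_CLASS = {
    0x0020: "SP",  # SPACE
    0x00A0: "GL",  # NO-BREAK SPACE
    0x2000: "BA",  # EN QUAD
    0x2001: "BA",  # EM QUAD
    0x2002: "BA",  # EN SPACE
    0x2003: "BA",  # EM SPACE
    0x2004: "BA",  # THREE-PER-EM SPACE
    0x2005: "BA",  # FOUR-PER-EM SPACE
    0x2006: "BA",  # SIX-PER-EM SPACE
    0x2007: "GL",  # FIGURE SPACE
    0x2008: "BA",  # PUNCTUATION SPACE
    0x2009: "BA",  # THIN SPACE
    0x200A: "BA",  # HAIR SPACE
    0x200B: "ZW",  # ZERO WIDTH SPACE
}

# Classes that offer a break opportunity after the character.
BREAK_AFTER = {"SP", "BA", "ZW"}


def break_offsets(text):
    """Return offsets i where a line may break between text[i-1] and text[i].

    Simplified assumption: only the space characters in the table create
    break opportunities, and a following GL character suppresses the break.
    """
    offsets = []
    for i in range(len(text) - 1):
        current = LINE_BREAK_CLASS.get(ord(text[i]))
        following = LINE_BREAK_CLASS.get(ord(text[i + 1]))
        if current in BREAK_AFTER and following != "GL":
            offsets.append(i + 1)
    return offsets


if __name__ == "__main__":
    sample = "a b\u00A0c\u2003d\u2007e"
    print(break_offsets(sample))
```

Under these properties an EM SPACE yields a break opportunity just as SPACE
does, while NO-BREAK SPACE and FIGURE SPACE do not.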

Basically, you are saying that TeX treats EM SPACE, EN SPACE, and THIN SPACE
as GL rather than BA. That is an additional semantic that TeX applies to
those characters, and it is unlikely to interoperate easily with other
applications. But other aspects of TeX line-breaking would not be
interoperable either, since they rely on the particular details of TeX's
line-breaking optimizations. So I would say the safer course for text
interchange is to rely only on SPACE and NO-BREAK SPACE (and possibly FIGURE
SPACE) to make explicit distinctions in breaking behavior, and not to expect
the fixed-width spaces to survive very reliably into formatted text when
interchanged in plain text.
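
One way to act on that advice is to flag the fixed-width spaces in a string
before interchange. The following is only a sketch; RELIABLE, FIXED_WIDTH and
fragile_spaces are hypothetical names, and the set of "reliable" characters
simply mirrors the three named above.

```python
import unicodedata

# Spaces whose breaking behavior is suggested above as safe to rely on.
RELIABLE = {"\u0020", "\u00A0", "\u2007"}

# The fixed-width spaces U+2000..U+200A from the table.
FIXED_WIDTH = {chr(cp) for cp in range(0x2000, 0x200B)}


def fragile_spaces(text):
    """List (offset, character name) for fixed-width spaces outside RELIABLE."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in FIXED_WIDTH and ch not in RELIABLE
    ]


if __name__ == "__main__":
    print(fragile_spaces("a\u2003b\u2007c d\u2009e"))
```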

--Ken


