Re: Anything from the Symbol font to add along with W*dings? from Jukka K. Korpela on 2011-08-14 (Unicode Mail List Archive)

From: Jukka K. Korpela <jkorpela_at_cs.tut.fi>
Date: Mon, 15 Aug 2011 07:55:12 +0300

15.8.2011 0:47, Asmus Freytag wrote:

> Not all documents are HTML or CSS.

The Numericana page that was cited argues for using Symbol font on web
pages, and I showed a few errors in its argumentation in that respect.
I also wrote: “However, it might be argued that the Symbol font has been
used in text documents (normally not plain text but text that may
contain different fonts) and that the characters so used are existing
usage that needs to be taken into account. There are two big ifs here:
if this involves symbols that do not exist as Unicode characters and if
the existing usage is relevant enough, then there might be something to
be considered for inclusion into Unicode. The burden of proof lies, of
course, on both ifs, with those who propose new characters.”

Maybe I should have stopped there…

> The question here is whether it's useful to add additional code points
> to allow plain-text coverage of certain widely spread fonts (of which
> "the" symbol font is one) so that it's possible to use, for example,
> automated processes to re-encode font runs in older documents to make
> them more fully portable.

“Rich text” that uses the Symbol font can be converted to plain Unicode
text, naturally losing any specific formatting such as particular shapes
of glyphs, to the extent that the glyphs of Symbol can be identified as
representing certain Unicode characters.
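
A minimal sketch of such a conversion, assuming the Symbol-font run is
available as raw byte codes. The table is deliberately partial and only
illustrative; the names SYMBOL_TO_UNICODE and convert_symbol_run are
hypothetical. Codes that cannot be identified are reported rather than
guessed.

```python
# Partial, illustrative table: Symbol font byte code -> Unicode character.
# A real converter would need the full mapping for the font in question.
SYMBOL_TO_UNICODE = {
    0x61: "\u03B1",  # GREEK SMALL LETTER ALPHA
    0x62: "\u03B2",  # GREEK SMALL LETTER BETA
    0x70: "\u03C0",  # GREEK SMALL LETTER PI
    0xA5: "\u221E",  # INFINITY
    0xD6: "\u221A",  # SQUARE ROOT
}


def convert_symbol_run(data: bytes):
    """Convert a run of Symbol-font byte codes to plain Unicode text.

    Returns (text, unresolved), where unresolved lists (offset, code)
    pairs for codes missing from the table. Each such position is
    filled with U+FFFD REPLACEMENT CHARACTER in the output text.
    """
    out = []
    unresolved = []
    for offset, code in enumerate(data):
        ch = SYMBOL_TO_UNICODE.get(code)
        if ch is None:
            unresolved.append((offset, code))
            out.append("\uFFFD")
        else:
            out.append(ch)
    return "".join(out), unresolved


if __name__ == "__main__":
    text, unresolved = convert_symbol_run(bytes([0xD6, 0x70, 0x60]))
    print(text)
    print([(offset, hex(code)) for offset, code in unresolved])
```

The glyph shapes specific to Symbol are lost in the output, as expected;
only the character identity survives.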

If a font vendor has decided to include glyphs that cannot be so
resolved, I would say that it is up to the vendor to step up and suggest
what the glyphs stand for as text characters and, if they do not exist
in Unicode, make a proposal on encoding them.

I wrote previously that Symbol 0xD6 is a particular glyph for SQUARE
ROOT U+221A and that Symbol 0x60 behaves in a way that does not match
Unicode coding principles (it combines with the _next_ character). It
seems that this behavior occurs in some implementations only; in others,
0x60 is a spacing character, which makes its encoding as a Unicode
character even more questionable.

The Symbol font also contains both sans-serif and serif variants of some
characters, like the registered sign “®”. If there is evidence that there
are texts that use both variants, then one might ask whether an addition
should be made to let this distinction be made in plain text. I would
say no (in this rather hypothetical issue), since the presence of a text
character in two font shapes in a single font does not imply that the
shapes need to be encoded as separate Unicode characters. A font may
well contain alternative glyphs for a character, and it should be up to
the font implementation and the way the font is used to control the
choice among alternative glyphs if needed.

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/

Received on Mon Aug 15 2011 - 00:00:17 CDT
