RE: Character identities

From: Marco Cimarosti
Date: Wed Oct 30 2002 - 12:55:48 EST

    Kent Karlsson wrote:
    > > I insist that you can talk about character-to-character
    > > mappings only when the so-called "backing store" is affected
    > > in some way.
    > No, why? It is perfectly permissible to do the equivalent
    > of "print(to_upper(mystring))" without changing the backing
    > store ("mystring" in the pseudocode); to_upper here would
    > return a NEW string without changing the argument.

    And that, conceptually, is a character-to-glyph mapping.
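
    A minimal sketch of the quoted pseudocode in Python, using the built-in
    str.upper in place of the hypothetical to_upper. The name "mystring"
    comes from the quote; the sample text and the name "shown" are
    assumptions made for illustration.

```python
# The "backing store": the text as the author stored it.
mystring = "character identities"

# str.upper() returns a NEW string; Python strings are immutable,
# so the original cannot be modified in place.
shown = mystring.upper()

print(shown)     # what gets displayed
print(mystring)  # the backing store, unchanged
```

    Whether one calls this step a character-to-character or a
    character-to-glyph mapping, the stored text is the same before and after.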

    In my view, you are so immersed in the OpenType architecture, and so used
    to the idea that glyphization is what a font "does", that you can't see
    the big picture.

    If you look at Unicode from a platform-independent perspective, fonts do not
    necessarily "do" something. In some architectures, fonts are just inert
    repositories of glyphs, and the display "intelligence" is somewhere outside
    the font.

    > > If the backing store is not changed, it is only a
    > > character-to-glyph mapping, however complicated and indirect
    > > it may be.
    > Yes. But with several font technologies "the user" can affect
    > the mapping in some ways, via "features". [...]

    Even in the simplest of technologies, the user can affect the mapping in
    some way, e.g. using a different font.
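
    As a rough sketch of that point, assume a "font" is nothing more than an
    inert table from code points to glyph names, with the lookup done by a
    separate function. The font tables, the glyph names, and the render
    function below are all hypothetical, invented for illustration; no real
    font format is claimed to work exactly this way.

```python
# Two hypothetical fonts, each an inert table: code point -> glyph name.
ROMAN_FONT = {0x0061: "a", 0x0308: "two-dots-above"}
FRAKTUR_FONT = {0x0061: "fraktur-a", 0x0308: "two-strokes-above"}


def render(text, font):
    """Map each character to a glyph name; the text itself is not modified."""
    return [font.get(ord(ch), ".notdef") for ch in text]


text = "a\u0308"  # U+0061 followed by U+0308 COMBINING DIAERESIS

roman = render(text, ROMAN_FONT)
fraktur = render(text, FRAKTUR_FONT)

print(roman)
print(fraktur)
```

    Switching fonts changes the glyphs that come out, while the string in
    "text" holds the same two code points throughout.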

    > My claim is that it is a bad idea for fonts (I don't dare
    > say "Unicode font" at this point) to do what *amounts to*
    > such in-effect character mappings *without explicit request*
    > from whoever is "in charge of" the text in some way (author,
    > editor, graphic designer, reader who likes to make changes to
    > the text, ...). Such changes should NOT be the result of
    > JUST changing font.

    These are all undue generalizations of the OpenType paradigm. Not all fonts
    "do" something (let alone what you wish them to do); not all font
    technologies have "modes" (or rather, *no* font technology has "modes",
    except in theory).

    > > To me, a glyph floating atop the letters "a", "o" and "u" is
    > > recognizably a German umlaut if (a) the text is written in
    > > German, and (b) the glyph has one of the following shapes:
    > >
    > > 1. Two small "blobs" (e.g. circles, squares, acute accents)
    > > placed side by side;
    > I'm going to opt to stay on the restrictive side here.
    > Except for the last one, that is a diaeresis, yes. That is the
    > modern standard way of writing "umlaut" in typeset German. The
    > last one is a double acute, which is normally not used for this
    > in German, and it is stretching things a bit too far to consider
    > it a glyph variant of diaeresis.

    I think what stretches things is failing to see that the "umlaut" of most
    Fraktur fonts looks like a double acute: a shape which is consistent with
    the usual shape of the dots on "i" and "j".

    BTW, strangely, you don't seem to be worried by the fact that "i" and
    "í" also look the same... What if I use Fraktur for Spanish?

    > > > If (and only if!) the author/editor of the text asks for an
    > > > overscript e should the font produce one. It is not up to
    > > > the font maker to make such substitutions without request,
    > >
    > > Yes. But a font which displays U+0308 with a glyph resembling
    > > the typical glyph for U+0364 is not "producing" anything; it
    > > is not "substituting" anything with anything else: it is just
    > > faithfully reproducing the text, according to the content
    > > decided by the author *and* according to the typographical
    > > style decided by the font designer.
    > This is not a typographic decision, it is a spelling decision,
    > and not up to the font designer, I'd say. It is a typographic
    > decision whether the diaeresis "digs into" the glyph below, or if
    > an e-above looks like a capital e inside. But spelling changes,
    > whether transient or permanent, should be the "author's" call.

    It is a cat biting its tail (*). If you consider it a "glyph variation", it
    is just a typographic decision; if you consider it a "character change", it
    becomes an orthographic issue.

    But treating the fact that a certain code point is displayed with a certain
    glyph as a "character change" is, IMHO, totally contrary to the letter and
    spirit of the Unicode character-glyph model.

    (*: Am I exporting an Italian idiom or is this used in English too? Anyway,
    it means "a chicken-and-egg issue".)

    _ Marco
