Re: Is long s a presentation form?

From: Jim Allan (
Date: Fri Nov 08 2002 - 13:32:20 EST

    The long s is indeed a variant of short s. In general, where long s is
    used at all, it appears in the middle of a word, while short s appears
    at the end of a word, but exceptions occur in various scribal
    traditions, mostly when s occurs at the end of a component of a compound
    word. Compare modern German, where ß (a ligature of long s followed by
    short s) is employed differently from ss.

    Accordingly the decision as to which form to use is not something to be
    left up to intelligent fonts, any more than capitalization should be,
    though capital letters and small letters (and small capitals) are also
    presentation variants of each other.

    The long s and short s variant situation is not unique in Unicode. In
    Greek there are the variant forms U+03C3 GREEK SMALL LETTER SIGMA and
    U+03C2 GREEK SMALL LETTER FINAL SIGMA which are parallel to the Latin
    letter forms in usage. Unicode also encodes the variant U+03F2 GREEK
    LUNATE SIGMA SYMBOL.
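
    As a quick illustration (not part of the original message), the
    following Python sketch looks up the character names with the standard
    unicodedata module. The code points U+0073 and U+017F for the Latin
    pair are added here for comparison, and the names CODE_POINTS and
    names are only illustrative.

```python
import unicodedata

# Latin short s and long s, followed by the three sigma forms
# mentioned in the message.
CODE_POINTS = (0x0073, 0x017F, 0x03C3, 0x03C2, 0x03F2)

# Map each code point to its formal Unicode character name.
names = {cp: unicodedata.name(chr(cp)) for cp in CODE_POINTS}

for cp, name in names.items():
    print(f"U+{cp:04X} {name}")
```

    Each positional variant has its own code point, so the choice between
    them is stored in the text itself.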

    In Hebrew final variants of certain letters are also coded separately
    from the normal forms. Section 8.1 of the Unicode Standard notes:

         Certain words, however, are spelled with nominal rather than
         final forms, particularly names and foreign borrowings in
         Hebrew, and some words in Yiddish. Because final form usage is
         a matter of spelling convention, software should not
         automatically substitute final forms for nominal forms at the
         end of words. The positional variants should be coded directly
         from the codeset and rendered one-to-one via their own
         glyphs--that is, without contextual analysis.
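
    To make the Hebrew case concrete, here is a minimal Python sketch (an
    addition for illustration, with the hypothetical names PAIRS and
    describe) listing the five letters that have separately encoded final
    forms. It only reads character names from the standard unicodedata
    module and performs no contextual substitution.

```python
import unicodedata

# (nominal form, final form) for the five Hebrew letters that have
# separately encoded final forms.
PAIRS = [
    ("\u05DB", "\u05DA"),  # kaf
    ("\u05DE", "\u05DD"),  # mem
    ("\u05E0", "\u05DF"),  # nun
    ("\u05E4", "\u05E3"),  # pe
    ("\u05E6", "\u05E5"),  # tsadi
]

def describe(pairs):
    """Return (nominal name, final name) tuples for each pair."""
    return [(unicodedata.name(n), unicodedata.name(f)) for n, f in pairs]

for nominal_name, final_name in describe(PAIRS):
    print(f"{nominal_name:<22} {final_name}")
```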

    The Unicode rule seems to be that where variants of a character normally
    vary by position, but not always, both are encoded. This allows an
    author or copyist to choose freely between them simply by entering the
    desired character, which is easier and more robust than depending on
    differing public and private higher-level protocols and on the numerous
    conflicting programs that might be embedded in different fonts.

    In English black letter characters may be used in formal circumstances,
    for example on a presentation scroll, usually with words spelled using
    modern conventions, including universal use of short s. The same font may
    be used for transcribing genuine medieval text using both long s and
    short s.

    One sometimes also wishes to use long s in any Latin letter font for
    quoting from older text where long s was used. This might have been done
    more often in the past if the long s character had been more readily
    available.

    It would be more confusing if some fonts reversed the glyphs or
    represented both as the same glyph and others did not. Orthographic
    distinctions should not normally be represented by a font change.
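
    When the distinction is kept in the text rather than in the font,
    software that wants to ignore it (for searching, say) can fold it away
    explicitly. The Python sketch below is only an illustration under that
    assumption: search_key is a hypothetical helper and the quoted string
    is an invented example. It relies on U+017F having a compatibility
    mapping and a case folding to plain s.

```python
import unicodedata

def search_key(text):
    """Fold compatibility and case differences, for matching only."""
    return unicodedata.normalize("NFKC", text).casefold()

# Invented example of a quotation typed with long s (U+017F).
quoted = "Paradi\u017Fe Lo\u017Ft"
modern = "Paradise Lost"

# The stored strings differ, so the spelling survives any font change.
print(quoted == modern)

# Folded for search, the two spellings compare equal.
print(search_key(quoted) == search_key(modern))
```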

    Similarly, u and v in Elizabethan English writing are often used in the
    reverse of modern convention, with the vowel usually indicated by
    v and the consonant by u. I don't think a modern digitized font based on
    an Elizabethan English font should therefore swap the positions of the
    two letters.

    Jim Allan

    This archive was generated by hypermail 2.1.5 : Fri Nov 08 2002 - 14:43:05 EST