Long S (was: Latin ligatures and Unicode)

From: Otto Stolz (Otto.Stolz@uni-konstanz.de)
Date: Mon Dec 20 1999 - 06:19:46 EST

On 1999-12-19 at 3:54, Eberhard Pehlemann wrote:
> Is this really another character than U+0073 LATIN SMALL LETTER S ?

In German (written in Fraktur, or in German handwriting), the distinction
between long and round S is an orthographic feature, a minimal pair of
words being
- Wachstube (wax tube)
- Wach<U+017F>tube (guard's room).
These are pronounced, and hyphenated, differently, viz.
- Wachs-tube [v'akstu:be], and
- Wach-stube [v'axStu:be], respectively,
where [v] as "v" in English "van",
      [x] as "ch" in Scots "loch",
      [S] as "sh" in English "shoe",
      [a] as "u" in English "hut",
      [e] as "e" in English "bent".

In contemporary spelling (if in Latin script), there is only one lowercase
S, which is normally identified with U+0073. Hence, word pairs such as the
example above are not distinguished in spelling, and you have to deduce
their respective meaning, pronunciation, and hyphenation from the context.
(Though the Long S was used even in Roman-typeface print, around 1920, in
some circles.)

So you need some way of discriminating between long and round S, in order
to reproduce German spelling until 1941, at least. This is one reason for
including Long S (U+017F) in Unicode: you can always represent Round S by
Small S (U+0073) -- though originally, the Long S was perceived as the
normal case, and the round S as a special character.
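
To make the distinction concrete, here is a small Python sketch (not part
of any standard tooling; the names wax_tube, guard_room and fold_long_s
are made up for illustration). It spells the minimal pair with U+0073 and
U+017F, and shows that compatibility normalization (NFKC) collapses the
long S back into the ordinary small S, i.e. into contemporary spelling.

```python
import unicodedata

LONG_S = "\u017f"   # LATIN SMALL LETTER LONG S
ROUND_S = "\u0073"  # LATIN SMALL LETTER S

# The minimal pair from above, spelled with distinct code points.
wax_tube = "Wach" + ROUND_S + "tube"    # Wachs-tube
guard_room = "Wach" + LONG_S + "tube"   # Wach-stube

def fold_long_s(text):
    """Illustrative helper: map long S to round S via NFKC.

    NFKC also folds other compatibility characters, so this is only
    a sketch, not a targeted long-S converter.
    """
    return unicodedata.normalize("NFKC", text)

print(wax_tube == guard_room)                # the two spellings differ
print(fold_long_s(guard_room) == wax_tube)   # folding loses the distinction
```

Once folded, the two words can no longer be told apart, which is exactly
the situation in contemporary spelling.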

As Long and Round S in German are related to the syllable boundaries,
another way to handle them could be to encode all small S alike and render
a small S before any non-Character (including soft-hyphen and ZWNJ) as a
round S, and as a long S in all other places. However, I do not know
whether this is feasible with other languages, and it is clearly not the
way Unicode has gone.
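
The alternative rule just described could be sketched roughly as follows,
in Python. This is only an illustration under assumptions of mine: the
function name render_s is hypothetical, "non-Character" is read as
"anything that is not a letter" (end of text included), and only the
lowercase U+0073 is touched. It is not how any real renderer works.

```python
SOFT_HYPHEN = "\u00ad"
ZWNJ = "\u200c"
LONG_S = "\u017f"

def render_s(text):
    """Pick a display form for every small s in text.

    Assumed rule: s stays round when the next character is not a
    letter (soft hyphen, ZWNJ, space, punctuation, end of text);
    otherwise it is shown as long S.
    """
    out = []
    for i, ch in enumerate(text):
        if ch == "s":
            nxt = text[i + 1] if i + 1 < len(text) else ""
            # str.isalpha() is False for "", soft hyphen and ZWNJ.
            out.append(LONG_S if nxt.isalpha() else "s")
        else:
            out.append(ch)
    return "".join(out)

# Syllable boundaries marked with a soft hyphen select the form.
wax_tube = render_s("Wachs" + SOFT_HYPHEN + "tube")    # s stays round
guard_room = render_s("Wach" + SOFT_HYPHEN + "stube")  # s becomes long
```

Note that under this scheme the writer has to mark the syllable boundary
(with a soft hyphen or ZWNJ) wherever a round S is wanted inside a word;
an unmarked "Wachstube" would come out with a long S.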

Best wishes,
   Otto Stolz
