FW: Sharp S

From: Hart, Edwin F. (HARTEF1@aplmsg.jhuapl.edu)
Date: Wed Jul 29 1998 - 09:28:51 EDT

I would like to endorse Mr. Stolz's suggestion about retaining the
mixed-case version and then provide an example.

At SHARE or SHARE-Europe meetings (I cannot recall which), we
discussed this very issue and came to the same conclusion. Essentially,
you store the text in mixed case and then include formatting in the text
to cause the text to be rendered in upper case. For example (I recall
seeing this particular example used before with respect to ambiguity
with Swiss-German spelling), store the text as:

  <upper-case start>B.B. mit ... Körpermaßen<upper-case end>
  (= B.B., and her remarkable physical measurements)

and then render it into upper case. Clearly this is rich text as
opposed to the plain text of Unicode. However, case conversion is
unambiguous and straightforward (without requiring context and an expert
speaker of German).
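
A minimal Python sketch of this store-mixed-case, render-upper-case
idea follows. It assumes the marker syntax from the example above is
used literally; the names UPPER_SPAN and render are hypothetical, and
Python's built-in str.upper() stands in for whatever case mapping the
rendering system would apply (it maps U+00DF to "SS").

```python
import re

# Hypothetical marker syntax, copied from the example in the message.
UPPER_SPAN = re.compile(r"<upper-case start>(.*?)<upper-case end>", re.DOTALL)


def render(stored):
    """Return the display form: marked spans upper-cased, markers removed."""
    return UPPER_SPAN.sub(lambda m: m.group(1).upper(), stored)


# The stored text keeps the original mixed-case spelling, including the sharp s.
stored = "<upper-case start>B.B. mit ... Körpermaßen<upper-case end>"
rendered = render(stored)
print(rendered)
```

Only the rendered copy is upper-cased; the stored string still contains
"Körpermaßen", so nothing ever has to be converted back to lower case.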


Edwin F. Hart
Applied Physics Laboratory
11100 Johns Hopkins Road
Laurel, MD 20723-6099
+1-240-228-6926 (from Washington, DC area)
+1-443-778-6926 (from Baltimore area)
+1-240-228-1093 (fax)
edwin.hart@jhuapl.edu

From: Otto Stolz [SMTP:stolz@iris.rz.uni-konstanz.de]
Sent: 20 July, 1998 11:08
To: Unicode List
Subject: Re: Sharp S

> And U+00DF for German has the uppercase "SS",
> but "SS" does not generally lowercase to U+00DF (unless you do
> context analysis on the data).

> Which is especially unreliable now that the German High Court
> has approved the spelling reform.

At 08:44 -0700 1998-07-17, Berthold Frommann wrote:
> the very fact that "sharp s" has the uppercase "SS" won't
> change at all.

There is indeed a change (though a minor one), in the uppercasing rules
for Sharp-S:
* According to the old rules, it was possible to uppercase Sharp-S
giving "SZ" rather than the common "SS", to avoid ambiguities. The only
known case where this rule applied, at all, was "Masse" (= mass, bulk)
vs. "Maße" (= measurements).
* In contrast, the new rule, §25E3, allows only "SS", cf.
<http://www.ids-mannheim.de/grammis/reform/a2-3.html#25E3> (in
On 1998-7-17 at 12:25, Michael Everson wrote:
> No, but context analysis has to be based on something. You
> can't use existing spellcheck dictionaries etc. because the
> rules have all changed.

Ah, not all of the rules have changed, though it is correct that one rule
governing Sharp-S vs. Double-S has been dropped, resulting in more
instances of Double-S than before.
However, capitalizing is an irreversible process; trying to lower-case
capitalized German text is a rather ambitious endeavour which is almost
certainly bound to fail. I would recommend keeping the original
mixed-case version wherever possible.
Generating the correct mixed-case spelling from a capitalized version
involves deep linguistic analysis, and even knowledge of the real world.
Notorious examples where real-world knowledge is needed:
  Er hat in Moskau liebe Genossen. = In Moscow, he has dear comrades.
  Er hat in Moskau Liebe genossen. = In Moscow, he has enjoyed love.
  B.B. mit ... Körpermaßen = B.B., and her remarkable physical measurements
  B.B. mit ... Körpermassen = B.B., and her considerable bodily masses
                (i.e. corpulence)
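
The loss of information can be illustrated with a short Python sketch.
It relies only on the built-in str.upper() and str.lower(); the helper
candidate_spellings is hypothetical and simply treats every
non-overlapping "SS" as either "ss" or Sharp-S. It does not attempt to
choose between the candidates, nor to restore noun capitalization,
which is where the linguistic and real-world knowledge would be needed.

```python
from itertools import product


def candidate_spellings(upper_word):
    """List lower-case spellings that could lie behind an upper-case word,
    reading each non-overlapping "SS" as either "ss" or "ß"."""
    parts = upper_word.lower().split("ss")
    results = []
    for choice in product(("ss", "ß"), repeat=len(parts) - 1):
        word = parts[0]
        for sep, part in zip(choice, parts[1:]):
            word += sep + part
        results.append(word)
    return results


# Both spellings collapse to the same upper-case form ...
same_upper = "Körpermaßen".upper() == "Körpermassen".upper()

# ... and plain lower-casing cannot recover the sharp s.
lowered = "KÖRPERMASSEN".lower()

# All a program can do without further knowledge is list the candidates.
candidates = candidate_spellings("KÖRPERMASSEN")
print(same_upper, lowered, candidates)
```

Keeping the mixed-case original avoids having to choose among such
candidates at all.
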
Examples of the linguistic difficulties that even the newest German
checking software usually does not get right (from "Methode ..."
and "Hilflose Helferlein", both by Dieter E. Zimmer):
  im Besonderen wurde ..., = in particular
  im besonderen Falle wurde ... = in that particular case
  im Folgenden ... = in the sequel
  im folgenden Absatz ... = in the next paragraph

                                Best wishes,
        Otto Stolz
