I would like to endorse Mr. Stolz's suggestion about retaining the
mixed-case version and then provide an example.
At SHARE or SHARE-Europe meetings (I cannot recall which), we
discussed this very issue and came to the same conclusion. Essentially,
you store the text in mixed case and then include formatting in the text
to cause the text to be rendered in upper-case. For example (I recall
seeing this particular example used before with respect to ambiguity
with Swiss-German spelling) store the text as:
<upper-case start>B.B. mit ... Körpermaßen<upper-case end> = B.B.,
and her remarkable physical measurements
and then render it into upper case. Clearly this is rich text as
opposed to the plain text of Unicode. However, case conversion is
unambiguous and straightforward (without requiring context and an expert
speaker of German).
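
A minimal sketch of this approach in Python, assuming the marker syntax
used in the illustration above (a real rich-text format would have its
own tags; the names UPPER_SPAN and render are hypothetical). It relies
on Python's str.upper(), which maps U+00DF to "SS":

```python
import re

# Hypothetical marker syntax, copied from the illustration above.
UPPER_SPAN = re.compile(r"<upper-case start>(.*?)<upper-case end>", re.DOTALL)


def render(stored: str) -> str:
    """Render stored mixed-case text, upper-casing only the marked spans."""
    return UPPER_SPAN.sub(lambda m: m.group(1).upper(), stored)


stored = "<upper-case start>B.B. mit ... Körpermaßen<upper-case end>"
rendered = render(stored)
print(rendered)  # B.B. MIT ... KÖRPERMASSEN
```

The stored string keeps the sharp s, so the original spelling is never
lost; only the rendered form contains "SS".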
Edwin F. Hart
Applied Physics Laboratory
11100 Johns Hopkins Road
Laurel, MD 20723-6099
+1-240-228-6926 (from Washington, DC area)
+1-443-778-6926 (from Baltimore area)
From: Otto Stolz [SMTP:email@example.com]
Sent: 20 July, 1998 11:08
To: Unicode List
Subject: Re: Sharp S
> And U+00DF for German has the uppercase "SS",
> but "SS" does not generally lowercase to U+00DF (unless you do
> context analysis on the data).
> Which is especially unreliable now that the German High Court
> the spelling reform.
At 08:44 -0700 1998-07-17, Berthold Frommann wrote:
> the very fact that "sharp s" has the uppercase "SS" won't
> change at all.
There is indeed a change (though a minor one) in the uppercasing rules:
* According to the old rules, it was possible to uppercase Sharp-S
giving "SZ" rather than the common "SS", to avoid ambiguities. The only
known case where this rule applied, at all, was "Masse" (= mass, bulk)
vs. "Maße" (= measurements).
* In contrast, the new rule, §25E3, allows only "SS", cf.
<http://www.ids-mannheim.de/grammis/reform/a2-3.html#25E3> > (in
On 1998-7-17 at 12:25, Michael Everson wrote:
> No, but context analysis has to be based on something. You
> existing spellcheck dictionaries etc. because the rules have
Ah, not all of the rules have changed, though it is correct that one rule
governing Sharp-S vs. Double-S has been dropped, resulting in more
instances of Double-S than before.
However, capitalizing is an irreversible process; trying to lower-case
capitalized German text is a rather ambitious endeavour which is almost
certainly bound to fail. I would recommend keeping the original
mixed-case version, wherever possible.
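
The irreversibility can be illustrated with a short Python sketch. The
helper name uppercase_collisions is hypothetical, and the word list is
taken from the examples in this thread; the sketch only shows that
distinct spellings collapse to one upper-case form, not how to tell
them apart again:

```python
def uppercase_collisions(words):
    """Group words by upper-case form; keep forms shared by several words."""
    groups = {}
    for word in words:
        groups.setdefault(word.upper(), []).append(word)
    return {upper: ws for upper, ws in groups.items() if len(ws) > 1}


words = ["Masse", "Maße", "Körpermassen", "Körpermaßen", "Moskau"]
collisions = uppercase_collisions(words)
print(collisions)

# Lower-casing cannot restore the sharp s: the result is always "ss".
recovered = "MASSE".lower()
print(recovered)  # masse
```

Both "Masse" and "Maße" end up under "MASSE", so nothing in the
capitalized string says which one was meant.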
Generating the correct mixed-case spelling from a capitalized version
involves deep linguistic analysis, and even knowledge of the real world.
Notorious examples where real-world knowledge is needed:
ER HAT IN MOSKAU LIEBE GENOSSEN.
Er hat in Moskau liebe Genossen. = In Moscow, he has dear comrades.
Er hat in Moskau Liebe genossen. = In Moscow, he has enjoyed love.
BRIGITTE BARDOT MIT IHREN BEACHTLICHEN KÖRPERMASSEN
B.B. mit ... Körpermaßen = B.B., and her remarkable physical measurements
B.B. mit ... Körpermassen = B.B., and her considerable bodily masses
Examples of the linguistic difficulties that even the newest German
checking software usually does not get right (from: "Methode
and "Hilflose Helferlein", both by Dieter E. Zimmer,
im Besonderen wurde ..., = in particular
im besonderen Falle wurde ... = in that particular case
im Folgenden ... = in the sequel
im folgenden Absatz ... = in the next paragraph