From: Asmus Freytag (email@example.com)
Date: Mon May 07 2007 - 18:35:09 CDT
On 5/7/2007 3:06 PM, Kenneth Whistler wrote:
>>> Adam Twardoch wrote:
>>> ... would make as little sense as encoding the
>>>> uppercase "ß" as "S ZWJ S".
> But of course stating it that way distorts the sense of the argument,
> anyway. The counterproposal is to say that given existing
> Unicode conventions, one could simply say that in those minority
> contexts where one wishes to display an <S, S> sequence as
> an uppercase [ß], use of a ZWJ to maintain a plain text distinction
> and a ligature from a font for presentation could suffice.
> That isn't *encoding* uppercase [ß] as "S ZWJ S"; it is
> displaying <S, ZWJ, S> with a ligature uppercase [ß] glyph.
In light of what the discussion has yielded so far, I find this line of
argument disingenuous. It has been demonstrated that, in the context of German,
the choice to retain ß in ALL UPPERCASE is a clear statement that "SS" is not
the desired *text*.
So the question can never be whether one wants another glyph for "SS";
one needs another form of ß.
> And John Hudson's argument about this is that using existing
> mechanisms might work better as a practical matter, because
> it has graceful fallback behavior.
If fallback behavior was the issue, falling back to a lowercase ß would
> But those advocating *for* uppercase [ß] don't seem to be
> making practical arguments here, as best I can tell. The
> argumentation is *essentialist* in nature: uppercase [ß] *is*
> a letter, not a ligature, *therefore* it *must* be encoded
> as a character.
The argument is essentially essentialist, but it doesn't proceed the way you
are trying to summarize here. The argument is that ß is a letter, and
reform has endorsed that view by giving that letter a less ambiguous
in the (standard) orthography, where it now is used *in contrast* to
'ss' to mark
Therefore, the argument goes, uppercase ß is *in essence* a form of ß, and
never a form of "SS".
> I've been around the bend enough times to realize there isn't
> much mileage to be gained in trying to argue down
> essentialists, but I would like them to at least consider
> the parallel with folks who have been arguing for years,
> for example, that "ksa" in Devanagari *is* a letter, and therefore
> must be encoded as a character.
Arguments about ksa in Devanagari appear of little help in this context,
as do aspersions cast at (groups of) people.
>>>> I strongly believe that "SS" is an anachronic, still-in-use but
>>>> slowly-to-vanish poor man's solution to write the uppercase "ß".
> I'm perfectly willing to accede that writing systems change,
> and the status of elements within them may change diachronically.
> There are plenty of such examples in the Latin script, as we
> all know. And it may well be that ß is in the middle of such
> a transition. As Asmus noted, its "letterhood" is now officially
> recognized in the German orthography, and as Adam and others
> talking about the nature of Latin as a bicameral script have
> been wont to point out, that means growing pressure for it
> to acquire an uppercase form, whether we like it or not. Certainly
> this echoes the process whereby many lowercase IPA use letters
> have acquired uppercase forms by dint of usage in language
The fact that there is persistent minority variation in the orthography on
this issue is in itself very telling, because the popular view of German
writers is that their orthography unambiguously follows mandated rules.
(Contrast this to the popular view of Americans about their own orthography,
and ponder this "thru"
> But Adam here is talking as if the future course of history
> here is predestined.
The "slowly-to-vanish" in the above quote may make it seem that way, but
I wouldn't over-interpret that. *If* you care to make a prediction about the
future, a continuation of the trend seems likely; however, it's taken over
10 years to get the latest reform digested, and that one happened 95 years
after the 1901 reform. In such a time span a lot of things can happen, and
Adam surely is aware of that as well.
> There apparently is a camp of people
> who think that not only is uppercase [ß] a letter and
> deserving of encoding as a character, but it will inevitably
> be reckoned as the rightful uppercase mapping of ß, with
> further attendant changes to formal orthographic rules.
I'm sure there is. You can call them "friends of the uppercase ß" if that
makes you feel better. I think it's great when letters can have fan clubs.
However, the fact that there are some people who are not only fascinated by
watching slow changes in orthography take place but wish to take sides in
favor of a particular direction does not enter into the analysis of the
minority orthography and its use of ß in ALL UPPERCASE.
> John Hudson responded:
>>> I suspect, and indeed hope, that you are right. ...[but] having a
>>> single lowercase character with two different uppercase mappings, one
>>> currently standard and enshrined in existing casing rules and
>>> implementations, one that might one day become standard and require
>>> some kind of overriding implementation, seems to me a bit of a
>>> standardisation and software development nightmare.
> And Asmus replied:
>> The 'nightmare' is not with the characters, but with the potential that
>> officially sanctioned rules might change.
> ... which Adam has as much as said is the future course of history.
And about which Asmus has said that it's too soon to tell.
> But I don't think Asmus' pooh-poohing the concerns of John about
> the character implementation issue does justice to the real
> issues here.
> The proposal formally suggests that uppercase [ß] get a lowercase
> mapping to ß, but that, for stability, ß not get an uppercase
> mapping to uppercase [ß]. That would be, to the best of my knowledge,
> an unprecedented kind of case mapping in the UCD,
The issue itself is not precedented. A formal solution would
(hypothetically) be to have localization based on the level of
'orthography' and not on the level of language-country pairs. This
cannot be a real solution since trying to tag texts correctly and making
sure software is configured at all times correctly is unrealistic in the
extreme - the more so, as the ALL UPPERCASE context itself is so
restricted in German writing.
Therefore the proposers wisely don't suggest such a thing, but consider
a certain lack of automatic case conversion in this instance a small
price to pay - as long as the text, once encoded, is transmitted and
Given that ß is used in ALL UPPERCASE context today, even in official
the existing case mappings tell only a partial story. Mapping ß to
itself on uppercasing
would match what I have called the 'minority' usage. Allowing a true uppercase
form that maps to lowercase ß leaves that problem completely unaddressable, but
creates no new problems. It is already impossible to round-trip ß from lowercase
to uppercase forms and back. All the change would do is to allow (manual)
support for a character that fits somewhat better in an ALL UPPERCASE
environment, but that is case folded correctly to ß and not ss.
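
To make the round-trip point concrete, here is a minimal sketch. It assumes Python's built-in str.upper() and str.lower() as a stand-in for the existing default case mappings; the sample words and the helper name round_trip are illustrative only.

```python
def round_trip(text: str) -> str:
    """Uppercase, then lowercase again, using the default mappings."""
    return text.upper().lower()


# ß expands to SS on uppercasing, and SS lowercases to ss, never back to ß.
print("straße".upper())        # STRASSE
print(round_trip("straße"))    # strasse

# Two different words collapse into a single ALL UPPERCASE spelling,
# which is the distinction the minority usage tries to preserve.
print("maße".upper() == "masse".upper())  # True
```
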
For users of the majority orthography, the side-effect of these mappings
is to allow easy conversion from non-standard to standard texts via
repeated case mappings. An unintended benefit, but, for the current (not
future) situation, a plus.
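
The conversion side-effect can be sketched as follows. This is a toy model of the proposed mappings, not an existing API: CAP_SHARP_S is a hypothetical private-use stand-in for the proposed character, and to_lower and to_upper are illustrative helper names.

```python
# Hypothetical stand-in for the proposed uppercase ß (a private-use code
# point, chosen here purely for illustration).
CAP_SHARP_S = "\uE000"


def to_lower(text: str) -> str:
    """Lowercase; the proposed uppercase ß maps to ß, as the proposal suggests."""
    return "".join("ß" if ch == CAP_SHARP_S else ch.lower() for ch in text)


def to_upper(text: str) -> str:
    """Uppercase with the existing mapping left unchanged: ß expands to SS."""
    return text.upper()


nonstandard = "STRA" + CAP_SHARP_S + "E"      # minority ALL UPPERCASE spelling
standard = to_upper(to_lower(nonstandard))    # lowercase, then uppercase again

print(to_lower(nonstandard))  # straße
print(standard)               # STRASSE
```

Because ß keeps its existing uppercase mapping in this model, lowercasing and then uppercasing a non-standard text yields the standard "SS" spelling, while nothing maps back to the new character automatically.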
> and has its
> own stability issue: there will be *years* of carping and rabblerousing
> that will follow on from that decision, as the camp which believes
> that the natural, self-evident, and essential casemapping
> relations should be:
> ß <--> uppercase [ß]
> ss <--> SS
> will attempt to get the UnicodeData case mappings (and implementations
> that follow from that) and case foldings "fixed" to reflect that
> inevitable rightness.
Well, it's always easy to rouse a rabble, but to make this change would
require that the standard orthography be changed, so it's a simple
matter of directing said rabble in that direction. If they can convince
their fellow countrymen that it's worth enduring another orthography
reform, then so much more power to them (hardly likely, *if* one cared
to predict the future based on recent experience).
> But any changes in such a direction *are* the kind of software
> development nightmare that John Hudson is warning about.
And as I observed, as quoted below, that applies to *any* change. The
unhappiness with the current crop of spell-checkers, trying to
faithfully implement the last reform, has reached endemic proportions in
Germany: ink is spilled by the book on that subject. Unicode can't
simply say: "no reforms, you've got what you got, and don't throw a fit"
and be done with it.
> I won't bother trying to get them to pledge that they won't ask
> for that, because they may well say so now (as the proposal does),
> but then simply turn around and ask for the changes anyway.
Personalizing it in this way is not helpful. If the orthography changes,
Unicode will be asked to change, and will have to follow. If the
orthography doesn't change, Unicode may be asked, whether by the authors
of the current proposal or by unrelated enthusiasts, but in neither case
does it have to act.
> Asmus went on to say:
>> There's absolutely nothing
>> that can prevent such a change, even if it were not to involve new
>> characters. For example, assume that the solution of using 'SZ' in
>> contrast to 'SS' became official. It would equally invalidate all
>> software and throw confusion even into (fuzzy) search and sorting, with
>> the potential of dragging lower case 'sz' into the fray.
> No doubt that would be the case.
And, as we argued that the distinction between "SS" and something else
(expressed as ß in lower case) is what's ultimately desired, it's not
clear that one can predict that it will be the uppercase ß. It may well
be the "SZ", even though, today, that does not seem likely (because it's
even uglier than the uppercase ß, even though it's had precedent...).
>> That's why the proposers, correctly in my opinion, did not base their
>> proposal on speculation on the direction of potential future reform, but
>> limited themselves to documenting the existing usage, which clearly can
>> be supported and deserves to be supported.
> But I just don't buy that argument. The "existing usage" can
> be supported with existing characters and with properly designed
> fonts, actually.
Not unless you are thinking about fonts that have a variant form of ß in
ALL UPPERCASE context. There are fonts that are ALL UPPERCASE, such as
Augsburger Titling, but asking for a font change for uppercase seems
somehow not to follow the precedent set by Unicode's encoding model.
Had the encoding model been one of using a COMBINING UPPERCASE FORM
SELECTOR applied to the lower case character, the whole issue would now
not be discussed, or not in the same way.
> I think this comes back down to the essentialist
> argument again. There is a group of German users and scholars
> who believe that uppercase [ß] *is* a character, and it is
> *that* which deserves to be supported, apparently.
Based on the view that ß *is not* a form of "ss".
Once you accept that, and recent changes have made that more compelling,
then you ask yourself why it is necessary to insist that it cannot be
represented in distinction to SS in uppercase, and why, if it is a letter,
those that do want to retain the distinction should do so (unaccountably)
via glyph selection only, when at its core, it's a semantic issue.
> I have yet to see cogent technical arguments for what real
> issues are being addressed here, other than the need to *display*
> uppercase [ß] glyphs on demand. The text processing arguments
> have all been mumbo-jumbo and handwaving so far.
I think the arguments are no less cogent than the arguments why you
should not type 8 when you
mean B and rely on font mechanisms to address the inconsistency from
> Furthermore, while the proposers may not have "base[d] their
> proposal on speculation on the direction of potential future
> reform", it is pretty clear from the discussion on this list
> that the decision to encode an uppercase [ß] is smack in the
> middle of such speculation, and encoding it will be used as
> a lever to make further changes.
Here you take Unicode's role much too seriously in the context of ongoing
developments in German writing. Even if Unicode were so important that it could
be used to lever a change -- it is not Unicode's task to take sides in how a
community of users wants to evolve their shared writing system. That is
just the kind of thing that Japanese users have always (unjustly) accused
Unicode of doing. So why start now?
>> I remember writing before somewhere that I think their proposal should
>> be accepted as presented.
> Ah, but it has been awhile since I've seen a single character
> encoding proposal engender this much debate and controversy.
> It may well be accepted as presented, but it is unlikely to
> do so with any clear consensus.
The character ß has been problematic in software support from the start
(string expansion upon uppercasing) precisely because the 1901 orthography
represents a snapshot in the evolution of the use of this character in German.
The reasons for the minority orthographic practice of retaining the ß in
ALL UPPERCASE context have to do with the same interim nature of the solution
recommended at that time. As a *universal* character encoding standard, Unicode
must remain neutral as far as further development of German orthography is
concerned, whether sanctioned or unofficial, but must cater to both.
Reading this rather lengthy set of counter-arguments has convinced me that
the correct starting point is indeed at the character semantic level, which
casts the question as: what form should a ß have that has been retained in
ALL UPPERCASE text so as to allow a distinction from SS for semantic reasons?
Once you argue that this form is not the same as that of the lowercase
letter, you then would have to argue why, suddenly, Unicode decides to depart
from precedent and picks a glyph variation solution, rather than an uppercase
form. If your reason is adherence to that aspect of the standard German
orthography that the use of ß in ALL UPPERCASE TEXT (in whatever form) is
meant to repudiate, then that's a poor argument indeed.
In conclusion, I find myself on the side of the "semanticists" and urge
find a way to approve this proposal as presented.
This archive was generated by hypermail 2.1.5 : Mon May 07 2007 - 18:37:03 CDT