Re: Uppercase ß is coming? (U+1E9E)

From: Asmus Freytag (
Date: Fri May 04 2007 - 13:44:02 CST

    I was at the WG2 meeting where the German National Body (represented by
    DIN) presented their proposal.
    For those who have not read the proposal carefully, it does contain a
    formal letter from a German federal government office stating that it
    sees no problems on its side that would prevent proceeding with the
    proposal.

    The character in question is an interesting case, since it represents an
    orthography that is at variance with the 'official' rules, which require
    that ß in uppercase is represented by SS. In a rule-oriented culture
    like the German one, the documented and continued existence of such a
    deliberate variation from standard orthography is an interesting phenomenon.

    The main purpose of the orthographic variant is to retain the ability to
    make the distinction between ss and ß. This distinction is most
    important for names, as the names themselves may embody exceptions to
    standard rules for usage of the ß character. The proposal documents many
    situations from post office to name registries where even official
    bodies fully support retaining the ß in otherwise all-uppercase text.
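
    To make the loss of distinction concrete, here is a minimal sketch in
    Python. The two names are illustrative only and are not taken from the
    proposal, and both helper functions are hypothetical. It assumes a
    runtime whose Unicode data includes U+1E9E.

```python
# Illustrative names only; they are not taken from the proposal.
def to_upper_standard(name: str) -> str:
    """Uppercase per standard orthography: str.upper() maps ß to SS."""
    return name.upper()


def to_upper_preserving(name: str) -> str:
    """Uppercase while keeping ß distinct, by substituting U+1E9E first."""
    return name.replace("\u00df", "\u1e9e").upper()


a, b = "Weiss", "Wei\u00df"

# Standard uppercasing makes the two spellings indistinguishable.
standard_collides = to_upper_standard(a) == to_upper_standard(b)

# With an uppercase ß the two spellings stay distinct ...
preserving_collides = to_upper_preserving(a) == to_upper_preserving(b)

# ... and lowercasing recovers the original ß spelling.
round_trip = to_upper_preserving(b).lower()
```

    Under these assumptions the standard mapping collapses both names to
    WEISS, while the preserving variant keeps them apart and lowercases
    back to the ß spelling.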

    The minority practice may also reflect the fact that substituting the SS
    simply looks 'odd' to many people - not surprisingly so, since
    all-uppercase text is not very common to begin with. (It's less common
    in German than in English; for example, there is no custom of
    uppercasing sections of legal documents.) But apparently, the simple use
    of the lowercase form looks equally odd to people, and that perceived
    problem has, for over one hundred years, prompted people writing by
    hand, or typographers, to create shapes that harmonize better with their
    uppercase surroundings.

    It is ironic that this uppercase form has found use on the cover of
    "Der große Duden" which is, after all, the "bible" on standard German
    orthography (as cited in the proposal). But it underscores that even the
    maintainers of the standard orthography implicitly acknowledged the
    place for this character.

    Normally, we call a shape of a character that's used in uppercase
    context a capital letter. Why does this case suddenly lead people to
    propose all sorts of funky encodings? I think the main reason is simply
    surprise: the mere existence of this character has been unknown to
    people who have been lulled into thinking that Unicode (or even
    8859-1) provided complete coverage for modern German writing practice.
    On that point, I note the case of the "commercial minus" which was quite
    recently encoded in Unicode based primarily on evidence from German (but
    also other European) usage.

    The other reason, I think, is the fact that implementers, whether
    German-speaking or not, are thoroughly familiar with the standard
    orthographic uppercasing behavior of ß. That tends to make everybody an
    'expert' on this character. As Mark and Ken have pointed out, there is a
    way to assign character properties to the uppercase ß that allows it to
    be recognized as an uppercase form, allows ß to be recognized as its
    lowercase, and allows it to be searched and sorted somewhat like ß, ss
    or SS (unless tailored for a particular level of distinction).
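
    As a rough illustration of the property assignment described above,
    the following sketch queries the character data that Python exposes.
    It assumes a runtime whose Unicode database includes U+1E9E (the
    character was published in Unicode 5.1); the constant and variable
    names are my own, and case folding stands in here for the loose
    matching that an untailored search would use.

```python
import unicodedata

CAPITAL_SHARP_S = "\u1e9e"
SMALL_SHARP_S = "\u00df"

# General category: the new character is an uppercase letter.
category = unicodedata.category(CAPITAL_SHARP_S)

# Its lowercase mapping is the existing ß.
lower = CAPITAL_SHARP_S.lower()

# Case folding treats uppercase ß, ß, ss and SS alike for loose matching.
folded = {s.casefold() for s in (CAPITAL_SHARP_S, SMALL_SHARP_S, "ss", "SS")}

# Existing behavior is untouched: uppercasing ß still yields SS.
legacy_upper = SMALL_SHARP_S.upper()
```

    The last line reflects the compatibility point: the default
    uppercasing of ß does not change, so the new character only appears
    where someone enters or selects it deliberately.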

    Encoding the uppercase ß as a character with these properties will make
    it accessible in environments other than handwriting or special
    "high-design" typography. That's all to the good - the kinds of
    restrictions that were forced upon users by limited technologies like
    the typewriter should be firmly a thing of the past. It may even
    become the character of choice to represent the ß in official documents
    where names are written in all uppercase. However, I suspect that the
    majority use will continue to follow the dictates of the
    standard orthography -- which is made easier by the fact that
    all-uppercase context is so uncommon in standard documents.

    For software vendors, this means that there's no need to treat any
    existing characters or existing data any differently than before, which
    is what matters for compatibility.

    In summary, having listened to all sides of this issue, I see no reason
    why the proposal to encode this character should not be approved.

