Re: Uppercase ß is coming? (U+1E9E)

From: Mark Davis (mark.davis@icu-project.org)
Date: Thu May 03 2007 - 17:33:17 CST


    In practice, I don't think this new character need cause any particular
    problems for searching. It can compatibly have the relation

    lowercase(capital-ß) = ß

    That means that we would make it a case-folding variant of ß, and a
    collation variant of ß. We would still keep the uppercase mapping:

    uppercase(ß) = SS
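
A minimal sketch of these relations, assuming a Python 3 runtime whose Unicode data already includes U+1E9E. The constant names and the `caseless_key` helper are illustrative only; the example word is the one used in the quoted message below.

```python
# Hypothetical names; only str.lower, str.upper and str.casefold are real APIs.
CAPITAL_SHARP_S = "\u1E9E"  # the proposed capital letter
SMALL_SHARP_S = "\u00DF"    # ß

def caseless_key(text: str) -> str:
    """Key for caseless search, using full Unicode case folding."""
    return text.casefold()

# lowercase(capital-ß) = ß
lowered = CAPITAL_SHARP_S.lower()

# uppercase(ß) = SS is kept as the default full uppercase mapping
uppered = SMALL_SHARP_S.upper()

# All three spellings fold to the same key, so a caseless search finds each one
with_capital_sharp_s = caseless_key("SCHRIFTGIE" + CAPITAL_SHARP_S + "EREI")
with_double_s = caseless_key("SCHRIFTGIESSEREI")
with_small_sharp_s = caseless_key("Schriftgießerei")
```

Because the folding is applied to both the query and the text, the search result does not depend on which of the three spellings the author typed.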

    Mark

    On 5/3/07, John Hudson <john@tiro.ca> wrote:
    >
    > I've been thinking about this, and it seems to me that the proposed
    > encoding isn't ideal. What it basically amounts to is a glyph encoding
    > for a variant form that some people like to use in place of the
    > orthographically normative sequence SS. Although it is proposed to be
    > named as a CAPITAL LETTER, it is deliberately cut off from any standard
    > relationship with the lowercase ß, so in this important respect does not
    > operate like a letter at all. It is a kind of letter-like sign that can
    > be used by people who don't mind breaking their text -- in terms of
    > spellchecking, case, searching, sorting -- in order to display this
    > glyph variant.
    >
    > It is this breaking of the text that is bothering me, perhaps because
    > I'm a font developer and it took a long time to drag this industry away
    > from the hack-text-to-get-glyph-variant school of typography and to a
    > place where glyph selection is properly separated from text encoding.
    >
    > The fundamental problem of the ß in casing is that, in German, the
    > character sequence 'SS' does not always equal 'SS' in terms of its
    > relationship to corresponding lowercase letters. Frankly, this just
    > looks like bad orthographic practice to me, and I'm not surprised that
    > some people over a century or more have attempted to effect a reform of
    > the orthography by the introduction of a clean uppercase equivalent to
    > ß. Now we're in the odd position of trying to find a way to support that
    > failed orthographic reform in parallel to the normal spelling
    > conventions, as a display option.
    >
    > I'm wondering if there isn't some way to do this that doesn't
    > necessitate breaking text in order to achieve the display result. Let's
    > say that someone has some lowercase text, e.g. 'Schriftgießerei', which
    > he decides he wants to set in all caps and using the uppercase ß glyph.
    > This will require either a custom casing algorithm to bypass the
    > standard special casing of the ß character, or will require manual
    > intervention. The latter is likely to be more common, and it is
    > necessary to effect the change before converting the text to uppercase
    > with standard algorithms, in order to catch the ß before it becomes SS.
    > So the user goes into the text and converts every occurrence of U+00DF
    > to the proposed U+1E9E. Then he runs the case conversion function in his
    > software to convert the rest of the letters in the string. It is a bit
    > laborious, but in the end he has what he wants... unless he wants the
    > result to be searchable as 'SCHRIFTGIESSEREI' or in a caseless search as
    > 'schriftgießerei'.
    >
    > [The proposal recommends for discussion a possible compatibility
    > decomposition to 'U+0053 U+0053' to 'provide for the equivalence of the
    > character sequences "capital ß" and "SS" in those applications that use
    > the Normalization Form KD or KC for the detection of sameness of names
    > etc.' How viable is this?]
    >
    >
    > My own line of thinking is leaning towards a different approach, which
    > would still require custom casing or manual intervention to achieve the
    > desired display result, but would do so without breaking the text or
    > contravening German orthographic rules. Rather than substituting for ß a
    > nominal uppercase letter character that doesn't behave like an uppercase
    > letter in its relationship to the lowercase, the user would substitute
    > the orthographically correct SS but with an intervening, ignorable
    > control character that would indicate the desired visual display, which
    > could be resolved at the glyph level as appropriate:
    >
    > S + ZWJ + S
    >
    > or perhaps
    >
    > S + CGJ + S
    >
    > This at once preserves the normative casing of ß for search operations
    > etc. while also providing a plain text distinction between capitalised
    > double-s and sharp-s. Vitally, it also provides a mechanism that enables
    > clean font switching with existing fonts, which by default would display
    > the orthographically correct 'SS' rather than a .notdef glyph in the
    > absence of glyph support for the uppercase ß form.
    >
    > John Hudson
    >
    > --
    >
    > Tiro Typeworks www.tiro.com
    > Gulf Islands, BC tiro@tiro.com
    >
    > We say our understanding measures how things are,
    > and likewise our perception, since that is how we
    > find our way around, but in fact these do not measure.
    > They are measured. -- Aristotle, Metaphysics
    >
    >
    >
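
For comparison, the S + CGJ + S (or S + ZWJ + S) alternative from the quoted message could be sketched as follows. This assumes the search layer drops the joiner before comparing; `search_key` and the variable names are hypothetical, and nothing here reflects how any particular font or search engine actually treats these sequences.

```python
CGJ = "\u034F"  # COMBINING GRAPHEME JOINER
ZWJ = "\u200D"  # ZERO WIDTH JOINER

def search_key(text: str) -> str:
    """Drop the ignorable joiners, then apply full case folding."""
    stripped = text.replace(CGJ, "").replace(ZWJ, "")
    return stripped.casefold()

# Uppercase text that keeps the normative SS but marks it as a sharp s
marked = "SCHRIFTGIES" + CGJ + "SEREI"
plain = "SCHRIFTGIESSEREI"

# The plain-text distinction survives in the stored string...
keeps_distinction = marked != plain

# ...while searches still match both the SS and the ß spellings
matches_double_s = search_key(marked) == search_key(plain)
matches_sharp_s = search_key(marked) == search_key("Schriftgießerei")
```

A font without a capital ß glyph would simply render the two S glyphs, which is the fallback behaviour the quoted message argues for.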


