From: Mark Davis (mark.davis@icu-project.org)
Date: Thu May 03 2007 - 17:33:17 CST
In practice, I don't think this new character need cause any particular
problems for searching. It can compatibly have the relation
lowercase(capital-ß) = ß
That means that we would make it a case-folding variant of ß, and a
collation variant of ß. We would still keep the uppercase mapping:
uppercase(ß) = SS
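As a rough illustration of those relations, here is a minimal Python sketch. It assumes an implementation whose Unicode data already carries U+1E9E with the mappings above (a recent Python 3 does); the helper name caseless_equal is made up for this example and is not part of any library.

```python
# Sketch only: relies on Python's built-in str methods, which apply the
# Unicode case mappings and full case folding shipped with the interpreter.
CAPITAL_SHARP_S = "\u1e9e"  # the capital letter under discussion
SMALL_SHARP_S = "\u00df"    # ß


def caseless_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after full case folding."""
    return a.casefold() == b.casefold()


lowered = CAPITAL_SHARP_S.lower()  # lowercase(capital-ß)
uppered = SMALL_SHARP_S.upper()    # uppercase(ß)

# Caseless comparison treats the capital form, ß and SS alike.
same_as_lower = caseless_equal("SCHRIFTGIE\u1e9eEREI", "schriftgie\u00dferei")
same_as_ss = caseless_equal("SCHRIFTGIE\u1e9eEREI", "SCHRIFTGIESSEREI")
```

Because the capital form folds to the same string as ß and SS, a caseless search still finds text that uses it, while the uppercase mapping of ß is left untouched.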
Mark
On 5/3/07, John Hudson <john@tiro.ca> wrote:
>
> I've been thinking about this, and it seems to me that the proposed
> encoding isn't ideal.
> What it basically amounts to is a glyph encoding for a variant form that
> some people like
> to use in place of the orthographically normative sequence SS. Although it
> is proposed to
> be named as a CAPITAL LETTER, it is deliberately cut off from any standard
> relationship
> with the lowercase ß, so in this important respect does not operate like a
> letter at all.
> It is a kind of letter-like sign that can be used by people who don't mind
> breaking their
> text -- in terms of spellchecking, case, searching, sorting -- in order to
> display this
> glyph variant.
>
> It is this breaking of the text that is bothering me, perhaps because I'm
> a font developer
> and it took a long time to drag this industry away from the
> hack-text-to-get-glyph-variant
> school of typography and to a place where glyph selection is properly
> separated from text
> encoding.
>
> The fundamental problem of the ß in casing is that, in German, the
> character sequence 'SS'
> does not always equal 'SS' in terms of its relationship to corresponding
> lowercase
> letters. Frankly, this just looks like bad orthographic practice to me,
> and I'm not
> surprised that some people over a century or more have attempted to effect
> a reform of the
> orthography by the introduction of a clean uppercase equivalent to ß. Now
> we're in the odd
> position of trying to find a way to support that failed orthographic
> reform in parallel to
> the normal spelling conventions, as a display option.
>
> I'm wondering if there isn't some way to do this that doesn't necessitate
> breaking text in
> order to achieve the display result. Let's say that someone has some
> lowercase text e.g.
> 'Schriftgießerei', which he decides he wants to set in all caps and using
> the uppercase ß
> glyph. This will require either a custom casing algorithm to bypass the
> standard special
> casing of the ß character, or will require manual intervention. The latter
> is likely to be
> more common, and it is necessary to effect the change before converting
> the text to
> uppercase with standard algorithms, in order to catch the ß before it
> becomes SS. So the
> user goes into the text and converts every occurrence of U+00DF to the
> proposed U+1E9E.
> Then he runs the case conversion function in his software to convert the
> rest of the
> letters in the string. It is a bit laborious, but in the end he has what
> he wants...
> unless he wants the result to be searchable as 'SCHRIFTGIESSEREI' or in a
> caseless search
> as 'schriftgießerei'.
>
> [The proposal recommends for discussion a possible compatibility
> decomposition to 'U+0053
> U+0053' to 'provide for the equivalence of the character sequences
> "capital ß" and "SS" in
> those applications that use the Normalization Form KD or KC for the
> detection of sameness
> of names etc.' How viable is this?]
>
>
> My own line of thinking is leaning towards a different approach, which
> would still require
> custom casing or manual intervention to achieve the desired display
> result, but would do
> so without breaking the text or contravening German orthographic rules.
> Rather than
> substituting for ß a nominal uppercase letter character that doesn't
> behave like an
> uppercase letter in its relationship to the lowercase, the user would
> substitute the
> orthographically correct SS but with an intervening, ignorable control
> character that
> would indicate the desired visual display, which could be resolved at the
> glyph level as
> appropriate:
>
> S + ZWJ + S
>
> or perhaps
>
> S + CGJ + S
>
> This at once preserves the normative casing of ß for search operations
> etc. while also
> providing a plain text distinction between capitalised double-s and
> sharp-s. Vitally, it
> also provides a mechanism that enables clean font switching with existing
> fonts, which by
> default would display the orthographically correct 'SS' rather than a
> .notdef glyph in the
> absence of glyph support for the uppercase ß form.
>
> John Hudson
>
> --
>
> Tiro Typeworks www.tiro.com
> Gulf Islands, BC tiro@tiro.com
>
> We say our understanding measures how things are,
> and likewise our perception, since that is how we
> find our way around, but in fact these do not measure.
> They are measured. -- Aristotle, Metaphysics
>
>
>
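For reference, the two manual workflows John describes in the quoted message can be sketched roughly as follows. This is only an illustration in Python: the function names are hypothetical, and treating CGJ (U+034F) as ignorable is modelled here by simply stripping it before comparison, which is a simplification of what a real search or collation engine would do.

```python
SMALL_SHARP_S = "\u00df"    # ß
CAPITAL_SHARP_S = "\u1e9e"  # capital form under discussion
CGJ = "\u034f"              # COMBINING GRAPHEME JOINER


def upper_with_capital_sharp_s(text: str) -> str:
    """Swap ß for U+1E9E first, then apply standard uppercasing."""
    return text.replace(SMALL_SHARP_S, CAPITAL_SHARP_S).upper()


def upper_with_marked_ss(text: str) -> str:
    """Swap ß for S + CGJ + S first, then apply standard uppercasing."""
    return text.replace(SMALL_SHARP_S, "S" + CGJ + "S").upper()


def strip_cgj(text: str) -> str:
    """Drop CGJ so the marked sequence compares as plain SS (simplified)."""
    return text.replace(CGJ, "")


word = "Schriftgie\u00dferei"
encoded = upper_with_capital_sharp_s(word)  # keeps a single capital-ß code point
marked = upper_with_marked_ss(word)         # keeps SS, with a joiner in between
```

With the first function, an exact-match search for 'SCHRIFTGIESSEREI' misses the result. With the second, the text matches once the joiner is ignored.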
-- Mark