Re: Uppercase ß is coming? (U+1E9E)

From: Kenneth Whistler (kenw@sybase.com)
Date: Fri May 04 2007 - 14:29:54 CST

    Marnen Laibow-Koser responded:

    > On May 4, 2007, at 2:53 PM, John Hudson wrote:
    > > Marnen Laibow-Koser wrote:
    > >
    > >> No argument there. There *shouldn't* be such a thing as capital
    > >> ß. But Unicode is descriptive and not prescriptive. Obviously,
    > >> people are using this misbegotten character, so it needs to have a
    > >> code point.
    > >
    > > Or are they using this misbegotten glyph variant, in which case it
    > > needs to have appropriate glyph level activation?
    > >
    > > It seems to me to be begging the question to assume that it is a
    > > character.
    >
    > It is a character, I think. To assume that it is an uppercase SS
    > ligature is to assume that an uppercase long S exists

    Actually, not at all. I think that is missing the point John is
    trying to make.

    > -- and we have
    > absolutely *no* evidence for that at all.

    Of course.

    > So I think it's begging
    > the question to assume that it is a ligature. Uppercase ß is
    > attested, if grudgingly so. Uppercase long s is not attested at all!

    First of all, take all this as stipulated:

      1. Uppercase ß is attested.
      2. Uppercase long s is not attested.
      3. Uppercase ß is graphologically derived by acquisition of
         a case distinction from the preexisting lowercase ß,
         and not by any separate historical ligation of its own.
         (And we don't need to argue whether it should have or
         shouldn't have. See point 1.)
      4. Lowercase ß is graphologically derived from the ligation
         of long s and z. (And also has at least two distinct
         shape traditions, one of which is known as the "3" shape.)
      5. Despite the graphological origin, in modern German,
         the lowercase ß is equivalent (for some contexts) to
         a <s, s> sequence, and not to a <long-s, z> sequence.
         
    O.k., if we can stipulate all that, then we don't have to
    argue it all, point-by-point ad nauseam. (Of course, if I'm
    wrong about any of that, argue away. ;-) )
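
    Point 5 can be observed in ordinary software case operations. The
    following is a minimal sketch, assuming Python 3's built-in string
    methods, which apply the Unicode full case mappings; the sample
    word and variable names are illustrative only.

```python
# Assumption: Python 3 str methods; the sample word is illustrative.
lower_sharp_s = "\u00DF"  # LATIN SMALL LETTER SHARP S

word = "Stra" + lower_sharp_s + "e"

# Full uppercase mapping expands the sharp s to the sequence <S, S>.
upper_word = word.upper()

# Case folding expands it to the sequence <s, s>, not <long-s, z>.
folded_word = word.casefold()

print(upper_word)
print(folded_word)
print(folded_word == "strasse")
```

    Neither operation produces a long s or a z, which matches the
    "equivalent (for some contexts) to <s, s>" reading above.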

    Now, given all that, what we have left is not arguing the
    *existence* of uppercase ß, but rather the "Is it live, or
    is it Memorex?" question:

       Is uppercase ß handled better in text processing as a
       *glyph*, without needing a distinct character encoding,
       or is it handled better in text processing as a
       distinctly encoded character (which also would have a
       representative glyph associated with it, of course)?
       
    If you take the first position, as John Hudson has been
    arguing, then the next question would be: "What is the
    glyph for uppercase ß a visual representation of?"
    And given all the evidence in the proposal, it is pretty
    clear that the answer is: <S, S>, i.e., a sequence of
    two uppercase S's.

    So that would lead to the suggestion (not stipulation, at this
    point):

      6. In modern German, the uppercase ß is equivalent (for
         some contexts) to an <S, S> sequence, and not to a
         <long-s, Z> sequence or anything else.
         
    Given that suggestion, then following precedent for other
    glyphs as visual representations of sequences of characters,
    it is perfectly reasonable to suggest that "uppercase ß"
    be *implemented* as a ligature in fonts, and that what
    it would be coded (in the fonts) as is the sequence <S, S>.

    And the next step in the argument is: Supposing you need to
    maintain a distinction in *plain* text between an <S, S>
    sequence which would not "ligate" (i.e., would be shown
    in presentation with a sequence of {S} and {S} glyphs) and
    an <S, S> sequence which *would* "ligate" (i.e., would be
    shown in presentation with a single {uppercase ß} glyph),
    then the standard mechanism for this in Unicode is:

       <S, S> <-- don't ligate by default
       <S, ZWJ, S> <-- ligate if appropriate and if the font in use
                        has an uppercase ß glyph mapped to
                        this sequence
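
    The two sequences above can be sketched in code. This is only a toy
    model, not how any real shaping engine works: FONT_LIGATURES and
    shape() are hypothetical names invented for illustration, standing
    in for a font's ligature table and a renderer's glyph selection.

```python
import unicodedata

ZWJ = "\u200D"  # ZERO WIDTH JOINER

plain = "SS"                 # <S, S>: don't ligate by default
ligating = "S" + ZWJ + "S"   # <S, ZWJ, S>: ligate if the font allows it

def describe(text):
    """List the code points in a string with their Unicode names."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

# Hypothetical ligature table for an imaginary font (an assumption).
FONT_LIGATURES = {("S", ZWJ, "S"): "{uppercase ß}"}

def shape(text, ligatures=FONT_LIGATURES):
    """Toy glyph selection: map characters to glyph names."""
    glyphs = []
    i = 0
    while i < len(text):
        seq = tuple(text[i:i + 3])
        if seq in ligatures:
            # The font maps this sequence to a single ligature glyph.
            glyphs.append(ligatures[seq])
            i += 3
        elif text[i] == ZWJ:
            # No ligature available: the joiner itself shows nothing.
            i += 1
        else:
            glyphs.append("{" + text[i] + "}")
            i += 1
    return glyphs

print(describe(ligating))
print(shape("STRA" + plain + "E"))
print(shape("STRA" + ligating + "E"))
print(shape("STRA" + ligating + "E", ligatures={}))
```

    With the hypothetical table, only the <S, ZWJ, S> form yields the
    single {uppercase ß} glyph; with an empty table (a font lacking the
    ligature), both forms fall back to {S} and {S}.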
                        
    O.k.?

    Now back to the regularly scheduled program. :-)

    --Ken


