Re: Punctuation symbols for partial cuneiform characters

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Sep 03 2003 - 14:32:02 EDT

    Well, since Michael is engaged in an all-guns-blazing campaign
    on the public list, I guess I need to weigh in, too.

    > > Don't worry. The scholars aren't using them anyway so there won't be
    > > any disunification cost.

    TBD.

    > > Ah, but one of my minions (laughs hysterically) has pointed out the
    > > following to me:
    > >
    > >> The CEILING brackets are (most commonly) used to denote the ceiling
    > >> function in math. The FLOOR brackets are similarly (most commonly)
    > >> used to denote the floor function in math.
    > >>
    > >> Look at the bottom half of page 4 (the one numbered 4, not counting
    > >> the pages before 1...) of
    > >> http://www.chl.chalmers.se/~kentk/LIA/lia2-draft-ed2.pdf.
    > >> This is conventional mathematical usage.
    > >>
    > >> They are used predominantly in math expressions.

    If Michael (admittedly math-averse) would bother to look at the
    mathematical source document (ISO/IEC 10967-2, Language independent
    arithmetic) he cites here from Kent Karlsson, he
    would discover that in math the floor and ceiling characters *are*
    used in bracketing pairs.

    > >> Looks like an inconsistency which can be resolved in two ways:
    > >> 1) Add new punctuation characters and leave these ones as symbols;
    > >
    > >
    > > Yes!

    This assumes the two categories are mutually exclusive. Formally,
    they are, of course, since the General Category is a partition,
    so that if a character has gc=Sm (or gc=So, or anything else), it
    can't also have gc=Po (or gc=Ps, or anything else). But in
    practice the line between actual usage of symbols and punctuation
    is quite fuzzy. There are plenty of symbols (including some
    dingbats) that are used as punctuation in various contexts. And
    in this particular case, the usage of floor and ceiling symbols
    in math does not prevent recognizing that their usage *even in
    math* as bracketing pairs on symbols is delimiter- and punctuation-like
    in practice.

    > >
    > >> 2) Adjust the categories of these ones to Ps.
    > >
    > > No!

    I concur that the General Category assignment does not need
    fiddling with. But in point of fact, assignment of gc=Sm is
    insufficient in actual applications to define details of usage
    and layout.

    One should not draw too many conclusions from the details of the
    preferred glyph shape of floor and ceiling in mathematical
    expressions (taller than Michael wants for corner brackets in
    medieval manuscript textual critical apparatus). Note that even
    the *regular* square brackets, U+005B/U+005D, have distinct layout
    behavior when they occur in mathematical expressions. That is
    insufficient reason to then go insisting that those need to
    be separately encoded as *characters*.

    > >
    > >> And what about bidi mirroring?
    > >
    > >
    > > These should function just like the square brackets.

    They do. On this item, Michael knows not whereof he speaks:

    005B;LEFT SQUARE BRACKET;Ps;0;ON;;;;;Y;OPENING SQUARE BRACKET;;;;
                                  ^^     ^
                                  
    2308;LEFT CEILING;Sm;0;ON;;;;;Y;;;;;
                           ^^     ^
                           
    Both of these are bd=ON (other neutral) and bidi-mirrored=Y. They
    behave identically in terms of bidi.
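
    The comparison above can be checked mechanically. What follows is
    a minimal sketch, not part of any official tooling: it parses the
    two UnicodeData.txt records exactly as quoted in this message and
    compares the bidi-related fields. The helper names (parse_record,
    bidi_behavior) are made up for illustration, and the values are
    the ones quoted here; a current library may report different
    properties, so the sketch deliberately does not query one.

```python
# Zero-based field positions in a semicolon-separated UnicodeData.txt record.
GENERAL_CATEGORY = 2
BIDI_CLASS = 4
BIDI_MIRRORED = 9

# The two records exactly as quoted in this message.
RECORDS = [
    "005B;LEFT SQUARE BRACKET;Ps;0;ON;;;;;Y;OPENING SQUARE BRACKET;;;;",
    "2308;LEFT CEILING;Sm;0;ON;;;;;Y;;;;;",
]


def parse_record(line):
    """Split one record and keep only the fields discussed here."""
    fields = line.split(";")
    return {
        "code": fields[0],
        "name": fields[1],
        "gc": fields[GENERAL_CATEGORY],
        "bidi_class": fields[BIDI_CLASS],
        "mirrored": fields[BIDI_MIRRORED] == "Y",
    }


def bidi_behavior(record):
    """The pair of properties that matter for bidi layout in this comparison."""
    return (record["bidi_class"], record["mirrored"])


bracket, ceiling = (parse_record(r) for r in RECORDS)

same_bidi = bidi_behavior(bracket) == bidi_behavior(ceiling)
same_gc = bracket["gc"] == ceiling["gc"]

print("same bidi behavior:", same_bidi)
print("same General Category:", same_gc)
```

    With the records as quoted, the bidi fields match and only the
    General Category field differs.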

    The only difference is: General Category Ps versus Sm, which I addressed
    above. Application behavior for these is not going to be automatically
    determined by the Ps/Pe assignments, since not all bracketing pairs
    of characters have those property assignments.
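
    To illustrate why Ps/Pe alone cannot drive bracket matching, here
    is a small hypothetical sketch. The General Category values for
    U+005B and U+2308 are the ones quoted in this message; the values
    for the closing characters and the floor pair are my assumption
    about the same version of the data. The PAIRS table and both
    function names are invented for the example and do not describe
    any real application.

```python
# General Category values; 005B and 2308 are as quoted in this message,
# the remaining entries are assumed for the same version of the data.
GC = {
    "\u005B": "Ps", "\u005D": "Pe",   # square brackets
    "\u2308": "Sm", "\u2309": "Sm",   # left/right ceiling
    "\u230A": "Sm", "\u230B": "Sm",   # left/right floor
}

# Hypothetical application-maintained table of opener -> closer.
PAIRS = {
    "\u005B": "\u005D",
    "\u2308": "\u2309",
    "\u230A": "\u230B",
}


def closer_by_category(opener):
    """Return the closer only if the pair is typed Ps/Pe."""
    closer = PAIRS.get(opener)
    if closer is not None and GC[opener] == "Ps" and GC[closer] == "Pe":
        return closer
    return None


def closer_by_table(opener):
    """Return the closer from the explicit table, ignoring General Category."""
    return PAIRS.get(opener)


for opener in PAIRS:
    print(f"U+{ord(opener):04X}",
          "by category:", closer_by_category(opener) is not None,
          "by table:", closer_by_table(opener) is not None)
```

    A matcher keyed on Ps/Pe finds the square brackets but misses the
    ceiling and floor pairs; an explicit table finds all three.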
      
    >
    > OK, I think I agree with you now. But this change needs to be
    > implemented quickly before the scholars do start using them. For each
    > scholar like Paul who asks this list before using the characters, there
    > may be many who read the standard and start doing what it tells them to
    > do, even if they don't much like the glyphs.

    TUS 4.0, p. 413:

      "Character images shown in the code charts are not prescriptive.
       In actual fonts, considerable variations are to be expected."
       
    TUS 4.0, p. 414:

      "Designers of high-quality fonts will do their own research into the
       preferred glyphic appearance of Unicode characters. ...
       
      "Many characters have been unified and have different appearances
       in different language contexts. ..."
       
    The latter note can easily be extrapolated to recognizing that the
    use of left/right floor/ceiling as bracket pairs in mathematics and the
    use of left/right ceiling as (corner) bracket pairs in medieval
    textual apparatus represent sufficiently different contexts that
    it is not unreasonable to expect "designers of high-quality fonts"
    to depict them with appropriately distinct appearances.

    Remember, folks, that Unicode is a *plain text* standard. Unless
    medievalists have some pretty compelling reason for *distinguishing*
    in their documents mathematical floor/ceiling notation from
    their textual conventions of corner bracketing, there really
    is nothing standing in the way of using the characters as
    recommended in the standard, except for an aversion to the specific
    design of the glyphs in the most widely available Unicode generic
    fonts.

    > In fact it is probably
    > already too late as that note has been printed in thousands of copies of
    > Unicode 4.0.0 and even it gets reversed in 4.0.1 people will continue to
    > find it in the printed book and follow it.

    Corner brackets have been discussed on this and other lists
    on a number of occasions before. The text in TUS 4.0 was added
    to guide people to the characters most likely to be appropriate
    for general corner bracket usage, since there are so many
    other possible choices already in the standard. (Note the newly
    added confusables: 23A1/23A4 and 23BE/23CB, as well as the
    old standbys: 231C/231F, 250C/2510, and 300C/300D.)
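
    For anyone who wants to see what those code points are, here is a
    short sketch using Python's standard unicodedata module. The
    describe helper and the CANDIDATES list are just for illustration;
    the names printed come from whatever version of the character
    database ships with the Python in use.

```python
import unicodedata

# Code point pairs listed in the note above, in the same order.
CANDIDATES = [
    (0x23A1, 0x23A4),
    (0x23BE, 0x23CB),
    (0x231C, 0x231F),
    (0x250C, 0x2510),
    (0x300C, 0x300D),
]


def describe(code_point):
    """Format a code point as 'U+XXXX NAME', with a fallback if unnamed."""
    name = unicodedata.name(chr(code_point), "<no name in this database>")
    return f"U+{code_point:04X} {name}"


for first, second in CANDIDATES:
    print(describe(first), "/", describe(second))
```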

    Michael may well succeed in a campaign to convince the UTC and
    WG2 to encode yet *another* set of corner-shaped characters
    as his preferred corner brackets to recommend to medievalists
    (or others). But his claim that there won't be any disunification
    cost is wrong, IMO.

    --Ken

    =============================================================

    Please note that for the duration of the Sobig.F worm, the
    email address "kenw@sybase.com" is being blackholed. If you
    wish to contact me by email, please use:
                 ken.whistler @ sybase.com


