RE: IPA Null Consonant

From: Kent Karlsson (kentk@md.chalmers.se)
Date: Mon Jun 02 2003 - 15:24:26 EDT


    Ken,

    Thanks for your thorough explanation! Finally something that
    is at least partially convincing! (Of that it *sometimes* is a borrowing
    of *symbolism* from set theory notation, that is, nothing more.)

    > Gustav Leunbach (1973), Morphological Analysis as a Step in
    > Automated Syntactic Analysis of a
    > Text. http://acl.ldc.upenn.edu/C/C73/C73-2022.pdf
    > uses an empty set symbol to denote a morphological zero.
    > (see p. 272). [Typographically, this could arguably
    > have been taken from a type tray for a Norwegian ø
    > character, rather than from a mathematical symbol font,
    > but this is *clearly* not a slashed zero.]

    That text uses a lowercase upright o-with-stroke. Other
    similar fragments are fragments of actual spelling (AFAICT),
    and are set in italics. In this case it is the uprightness that
    marks this letter as a meta-notation, and not an object-letter.
    (The text also uses some uppercase (and upright) letters X and Z
    for meta-notation, variables if you like, in a table where other
    ("normalised" to uppercase) letters stand for themselves. That, of
    course, does not make those X and Z into non-letters from a
    Unicode point of view.)

    > A. S. Liberman (1973), Towards a Phonological Algorithm.
    >
    > http://acl.ldc.upenn.edu/C/C73/C73-1015.pdf
    >
    > uses an empty set symbol to denote a phonological zero.
    > (See pp. 196-197 for numerous examples.) These are
    > clear examples, and show that this is used symbolically,
    > to indicate a "something which is not there". Look at
    > the type style. These are included in *italic* word
    > citations, but the null set symbol (used to denote the
    > phonological zero), is *not* set in italic.

    That paper uses an upright uppercase o-with-stroke.
    Again it's the uprightness that signals that this is
    metanotation, while other parts of the example texts
    are set in italics (to signal that it is literal text, or rather
    object-text).

    Note that the empty set can be, and has been, typeset as an
    italic U+00D8...

    > Harri Jäppinen and Matti Ylilammi (1986), Associative Model
    > of Morphological Analysis: An Empirical Inquiry.
    >
    > http://acl.ldc.upenn.edu/J/J86/J86-4001.pdf
    >
    > Displays a distinctive usage, with an italic epsilon to
    > denote a morphological zero. (Not the same as the set theory
    > use of epsilon to denote set membership.)

    This is closer to the word theoretic convention of denoting the
    empty string with an (italic) epsilon.

    In the typeset portions of the text, all the Greek letters are
    in italic. Whether this is a deliberate choice or just happenstance
    is not clear. In the portions that appear to be facsimile from a
    manuscript, they are upright. What is important for our discussion
    here is that the symbol used is a letter (L*), not a math symbol (Sm).

    Just for an example of a linguistic paper that actually uses the
    empty set and the empty set symbol to denote it:
    http://www.linguistics.ucla.edu/people/stabler/elkeps-paris.pdf
    (page 9). They also use a hyphen followed by the empty set SYMBOL
    (page 6; note the preceding hyphen), possibly in the meaning
    "empty string" (there is no empty set there!). Indeed, the empty set
    symbol here acts very much like a visible "filler" letter(!).

    http://project.cgm.unive.it/events/papers/yablonski.pdf appears
    to be using ~ (tilde) to denote the empty string.

    http://www.eki.ee/teemad/morfoloogia/kuusik2.html also uses
    epsilon to denote the empty string. So does
    http://assets.cambridge.org/0521631963/sample/0521631963WS.pdf
    (which is also very much oriented towards formal languages, and uses
    that tradition's notations; it also uses a symbol similar to the empty
    set symbol as a regular expression atom; p. 11).

    Different authors use different notation. No surprise, that is common,
    especially before the notation has settled. What I think would be a
    mistake though, is to a posteriori try to normalise all of the (slightly)
    different notations used to a common one. Then we get into transcription
    of meta-notations, and that should be done consciously, not by
    character encoding and later font choice/replacement, which might
    not be under the control of an author, or even (human) editor.

    > > A slashed zero is completely
    > > unrelated to the empty set symbol.
    >
    > This is nonsense.

    Well, no... ;-)

    > You have found the correct citations
    > on the web regarding André Weil's claim to have introduced
    > the empty set symbol, as part of the Bourbaki group. And
    > for Weil, the source of the symbol may well be Norwegian ø.
    > (What the Weil citation doesn't specify is why he chose
    > a symbol vaguely reminiscent of a zero, while not actually
    > being a zero, to represent the empty set.)

    Well, if the motivation really had been "taking the glyph for zero
    and putting a slash over it" that would have been a very easy
    motivation that anyone could have used, as seen by home-cooked
    derivations both from Michael and various type designers. But it
    is NOT the one Weil uses. If you want me to speculate, I too can do
    so (but I do label it as speculation): the ring symbolises a set (think
    of Venn diagrams) and the stroke symbolises its emptiness.

    ...
    > > The empty set symbol and slashed zero remain unrelated.
    >
    > Another bald assertion contradicted by Pullum (1996), who
    > *does* relate them, in linguistic usage. Nobody is claiming
    > that in *mathematical* usage they are connected, or would
    > be acceptable alternative glyphs in a treatise on set theory.

    Well, one reason for not being quiet is that while Pullum sort of
    (but not quite) semantically equates them, Pullum also says (p. 137):
    "Dennisen [...] uses the slightly *different* null [sic] set symbol[...]"
    (my emphasis), thus making a definite difference between the
    "null sign" and the empty set symbol. Note also that this
    is listed among other letters in Pullum's list (though there is
    no formal classification of the symbols).

    > What you are missing here is that the use of the empty set
    > symbol in linguistics is associated with structuralist
    > linguistics, which was in intellectual development roughly
    > contemporaneously with the Bourbaki group. And structuralist
    > morphology, in particular, was influenced by formal set
    > theory, and many morphologists borrowed the kind of formalisms
    > used by logicians and set theoreticians.

    Good. Thanks. This does provide a link between the two!

    > A phonological zero or a morphological zero has nothing to
    > do with numeric values, nor is it conceived of as part of
    > a word, per se. It is a pattern gap, an absence, a set with
    > no elements.

    There is still no actual set here (but see PS below)... A "gap", ok.
    But there are characters in Unicode, with general category Lo, to
    denote (other) gaps; and some don't even have any glyph (think of
    the Hangul fillers). And you mentioned yourself (above) the use of
    the *letter* epsilon to denote a morphological "zero".

    > Your mistake is to assume that this derives from some kind
    > of transcriptional usage. It does not. It comes from
    > pattern analysis of structural systems, by structuralist
    > linguists influenced by mathematical formalism and set theory,
    > among other things.

    Ok. That does, however, not mean that an "empty" structure cannot
    be denoted by a letter (like Ø or e).
     
    ...
    > > Then I promise to be very quiet (and nod ok)! ;-)
    >
    > Please read Liberman, and then be very quiet and nod ok.

    Hmm. No claim of set theoretic derivation there...

    I did browse a few papers on linguistics that use sets
    and set theoretic notation (some referenced above).
    None that I found claim that the "slashed ring-like" symbol
    in linguistic patterns is an empty set symbol, or that it
    derives from set theory. All of them appear to apply set theory
    correctly though (in contrast to Jarkko's jump to erroneous
    conclusions; see also the ps below).

    ...
    > But the phonological/morphological zero is *NOT* a letter
    > of transcription. It is a symbol which appears in phonological
    > and morphological analysis.

    So would you then consider the use of e (epsilon) erroneous
    for this usage (see above)?

    > Morphologists also embed other
    > symbols in such analyses, including juncture symbols such
    > as "-", "+", "#", "=", and so on. But such practice does
    > not make those symbols letters, either.

    But do they represent anything that *was* a sound/morpheme/word?

                    /kent k

    PS (getting a bit off-topic)

    If you really want to see expressions like -st, -ing, and -Ø (or -e, -Ø,
    or -~), as expressions where juxtaposition denotes concatenation,
    "ordinary" letters (not meta-letters) stand for a singleton set
    of that letter, and a hyphen indicates "fragment for concatenation",
    then what is this "empty" thingie? Can it really be the empty set?
    Well, no, because any concatenation with the empty set results
    in the empty set, which is obviously not what the authors intended.
    Note that concatenation of sets of strings (strings are in formal
    language theory called "words") is defined as the set of strings
    resulting from concatenating all strings in the first set with all
    strings in the second set. If either set is empty, the concatenation
    is thus empty! So the "empty" here, if you want to see it this way,
    must be the singleton set of the empty string (to get the reasonably
    intended result). The set of the empty string is an identity element
    for the concatenation of sets of strings operation. (So, if you like,
    concatenation is like multiplication, the empty set is like 0, and
    the "linguistic null" is like 1!)
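
    The argument above can be checked mechanically. What follows is only
    a minimal sketch in Python, assuming finite sets of strings; the names
    concat, EMPTY_SET and EPSILON_SET are made up for this illustration
    and do not come from any of the papers cited.

```python
from itertools import product

def concat(a, b):
    """Concatenate two sets of strings: each string in a followed by each string in b."""
    return {x + y for x, y in product(a, b)}

# Hypothetical names, chosen for this illustration only.
EMPTY_SET = set()      # the empty set: contains no strings at all
EPSILON_SET = {""}     # the singleton set containing only the empty string

stem = {"walk"}

print(concat(stem, {"ing"}))       # {'walking'}
print(concat(stem, EMPTY_SET))     # set()    -> the empty set annihilates, like 0
print(concat(stem, EPSILON_SET))   # {'walk'} -> the set of the empty string is the identity, like 1
```

    So a "zero" suffix that leaves the stem unchanged behaves like
    EPSILON_SET here, not like EMPTY_SET.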


