Re: Proposal to encode three combining diacritical marks for Low German dialect writing

From: Kenneth Whistler (kenw@sybase.com)
Date: Fri Jan 18 2008 - 20:00:54 CST

    > [after: · Yoruba]
    > · German dialect writing
    > · The length, attachment to the base character or slanting on use with
    > italic fonts depends on local preference

    Thanks. The editorial committee will take that and James'
    suggestion on board to come up with something.

    >
    > KW> For the record, I agree with Michael and James about the
    > KW> ogonek, as well.
    >
    > Here, I am still not convinced. Look at:
    > http://www.sprachatlas.phil.uni-erlangen.de/materialien/Teuthonista_Handbuch.pdf
    > (German), where you find hooks to denote openness of vowels (like the
    > one contained in my draft proposal) which are definitely distinct from ogonek.

    > As far as I know now, these hooks are related to the one proposed
    > in my draft.

    I agree, they are probably related. But all the more reason,
    for plain text representation of the content of the transcriptions,
    IMO, to associate these with U+031C COMBINING LEFT HALF RING BELOW,
    which is semantically correct for this.

    But there is still no contrast with ogonek demonstrated by
    this, by the way, since Teuthonista uses tildes, not ogoneks,
    to represent nasalization.

    The Teuthonista hook seems deliberately not to attach to
    anything, either, which is more like the combining left half
    ring. Formally, it is being used as a paradigmatic pair
    with the dot below.
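
    As a rough illustration of the attachment point, the character
    properties of the candidates can be inspected with Python's standard
    unicodedata module. This is only a sketch: the CANDIDATES table and
    the describe() helper are names made up for this example, and the
    combining class says nothing about how any particular font draws
    the mark.

```python
import unicodedata

# Candidate characters discussed in this thread for the "openness" hook,
# plus the dot below that the hook pairs with. The labels are informal.
CANDIDATES = {
    "ogonek": "\u0328",
    "left half ring below": "\u031C",
    "dot below": "\u0323",
}


def describe(ch: str) -> tuple[str, int]:
    """Return the Unicode character name and canonical combining class."""
    return unicodedata.name(ch), unicodedata.combining(ch)


if __name__ == "__main__":
    for label, ch in CANDIDATES.items():
        name, ccc = describe(ch)
        print(f"U+{ord(ch):04X} {name} (ccc={ccc}) [{label}]")
```

    The ogonek has combining class 202 (attached below), while the left
    half ring below and the dot below both have class 220 (below), which
    fits the observation that the hook and the dot behave as a pair.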

    > I deliberately decided not to refer to Teuthonista in my proposal
    > draft, but this maybe was wrong.
    > Teuthonista is used by some universities in Southern Germany. The
    > users I have contacted are happy with their legacy 8-bit encodings.
    > Making a proposal to encode Teuthonista would probably raise
    > interesting discussions, as it would add Tibetan-like stacking to the
    > Latin script (unless you encode all possible vertical combinations as
    > single characters, as the legacy 8-bit encodings do).

    Yes. The stacking of the Grundzeichen (aeiou) could be handled
    in Unicode plain text because of the set of combining Latin
    small letters in the standard now. But the generalization of
    that stacking to the Reduktionsvokale (alpha, schwa, upsilon,
    dotless-smallcap-i) would pose a problem.
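
    A minimal sketch of the Grundzeichen case, assuming the combining
    Latin small letters U+0363..U+0367 are used for the upper vowel.
    The COMBINING_VOWEL table and the stack() helper are hypothetical
    names for this example, not an established convention.

```python
import unicodedata

# Combining Latin small letters a, e, i, o, u (U+0363..U+0367).
# This table covers only the five Grundzeichen named above.
COMBINING_VOWEL = {
    "a": "\u0363",
    "e": "\u0364",
    "i": "\u0365",
    "o": "\u0366",
    "u": "\u0367",
}


def stack(base: str, over: str) -> str:
    """Return `base` followed by the combining form of `over`.

    Raises KeyError when `over` has no entry in COMBINING_VOWEL.
    """
    return base + COMBINING_VOWEL[over]


if __name__ == "__main__":
    seq = stack("o", "e")  # o with a small e stacked above
    for ch in seq:
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

    Anything outside the table, such as a schwa as the upper letter,
    raises KeyError in this sketch, which mirrors the point that the
    stacking does not generalize to the Reduktionsvokale this way.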

    Teuthonista also has other characteristics not amenable to
    plain text, including the indication of grades of nasalization
    by using a lightface tilde for nasalization, a boldface tilde
    for strong nasalization, and a lightface tilde enclosed in
    parentheses for light nasalization.

    The left-to-right composition of diacritics above implied
    by the light nasalization convention is also seen in the
    use of the doubling of the hook we have been talking about
    to represent a greater degree of openness of vowels.
    So one dot below for closed, two dots below for very closed,
    one righthook below for open, two righthooks below for
    very open. And then even two hooks between parentheses
    for "somewhat more open".
    The side-by-side convention couldn't be accommodated for
    dialect transcription, whether we decided the hook in question
    was U+0328, U+031C, or required a new character.

    If the users of Teuthonista ever decide that their system
    would need to be convertible to Unicode -- as opposed to
    staying in their legacy 8-bit encoding(s) -- I suspect it
    would best be handled with a set of light-weight markup
    conventions on top of the existing encoded characters,
    the way Egyptian hieroglyphic texts will be represented
    in Unicode, rather than trying to smoosh all these strange
    conventions directly into plain text.
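
    Purely as an illustration of what such markup might look like: the
    sketch below keeps a single hook in the plain text and moves the
    grade of openness into a wrapper. The <v grade="..."> element, the
    grade names, and the mark_open() helper are all invented for this
    example; nothing like them is defined by Teuthonista or by Unicode.

```python
HOOK = "\u031C"  # COMBINING LEFT HALF RING BELOW

# Hypothetical grade names for the three conventions described above.
GRADES = ("open", "somewhat-more-open", "very-open")


def mark_open(base: str, grade: str) -> str:
    """Encode an open vowel as base + hook, with the grade kept in markup.

    The plain grade "open" needs no markup; the other grades wrap the
    same character sequence in a hypothetical <v> element.
    """
    if grade not in GRADES:
        raise ValueError(f"unknown grade: {grade!r}")
    text = base + HOOK
    if grade == "open":
        return text
    return f'<v grade="{grade}">{text}</v>'


if __name__ == "__main__":
    for grade in GRADES:
        print(mark_open("e", grade))
```

    The underlying character sequence is the same for every grade, so
    searching and sorting on the plain text are not affected by the
    presentation-level distinctions.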

    --Ken


