RE: Proposal to encode three combining diacritical marks for Low German dialect writing

From: Kent Karlsson (kent.karlsson14@comhem.se)
Date: Mon Jan 21 2008 - 04:06:25 CST

     
    Kenneth Whistler wrote:
    > > [after: Yoruba]
    > > German dialect writing
    > > The length, attachment to the base character or slanting on use with
    > > italic fonts depends on local preference
    >
    > Thanks. The editorial committee will take that and James'
    > suggestion on board to come up with something.

    Some suggestions included "/locale". I don't think anybody's
    locale setting should have any influence on that.

    ...
    > The Teuthonista hook deliberately doesn't seem to attach to
    > anything, either, which is more like the combining left half
    > ring. Formally, it is being used as a paradigmatic pair
    > with the dot below.

    ...as well as with diaeresis below and double right hook below.

    > > I deliberately decided not to refer to Teuthonista in my proposal
    > > draft, but this maybe was wrong.
    > > Teuthonista is used by some universities in Southern Germany. The
    > > users I have contacted are happy with their legacy 8-bit encodings.
    > > Making a proposal to encode Teuthonista would probably raise
    > > interesting discussions, as it would add Tibetan-like stacking to the
    > > Latin script (unless you encode all possible vertical combinations as
    > > single characters, as the legacy 8-bit encodings do).
    >
    > Yes. The stacking of the Grundzeichen (aeiou) could be handled
    > in Unicode plain text because of the set of combining Latin
    > small letters in the standard now. But the generalization of

    Naa. The stacked letters are of equal size, which is atypical for a
    base letter + diacritic combination (even if the diacritic is
    a letter).

    > that stacking to the Reduktionsvokale (alpha, schwa, upsilon,
    > dotless-smallcap-i) would pose a problem.

    I would instead suggest that all of these stacked letter pairs
    (it's just a handful) should be encoded as atomic characters
    (no decomposition).
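
    For reference, the "combining Latin small letters" mentioned in the
    quoted text can be listed with a short script. This is only a sketch
    using Python's standard unicodedata module; the names STACKABLE_VOWELS
    and stacked are made up for illustration. It shows that a, e, i, o, u
    exist as combining marks above (class 230), and that a base letter plus
    such a mark stays a two-code-point sequence, with no precomposed form.

```python
import unicodedata

# U+0363..U+0367 are the combining Latin small letters a, e, i, o, u.
# STACKABLE_VOWELS is a hypothetical name used only for this illustration.
STACKABLE_VOWELS = [chr(cp) for cp in range(0x0363, 0x0368)]

for mark in STACKABLE_VOWELS:
    # All of them are ordinary "above" marks (canonical combining class 230).
    print(f"U+{ord(mark):04X}", unicodedata.name(mark), unicodedata.combining(mark))

# "a" with a small "e" stacked above it: base letter + combining letter.
stacked = "a" + "\u0364"

# There is no precomposed character for this pair, so NFC leaves it alone.
assert unicodedata.normalize("NFC", stacked) == stacked
```

    Note that this says nothing about rendering: the combining letter is
    normally drawn smaller than the base, which is the point made above.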

    > Teuthonista also has other characteristics not amenable to
    > plain text, including the indication of grades of nasalization
    > by using a lightface tilde for nasalization, a boldface tilde
    > for strong nasalization, and a lightface tilde enclosed in
    > parentheses for light nasalization.
    >
    > The left-to-right composition of diacritics above implied
    > by the light nasalization convention is also seen in the
    > use of the doubling of the hook we have been talking about
    > to represent a greater degree of openness of vowels.
    > So one dot below for closed, two dots below for very closed,
    > one righthook below for open, two righthooks below for
    > very open. And then even two hooks between parentheses
    > for "somewhat more open".
    > The side-by-side convention couldn't be accommodated for
    > dialect transcription, whether we decided the hook in question
    > was U+0328, U+031C, or required a new character.

    I don't see why there should be any problem in principle to encode
    COMBINING PARENTHESISED DOUBLE RIGHT HOOK BELOW, COMBINING
    PARENTHESISED DIAERESIS BELOW, COMBINING FAT TILDE, etc.

    Note also that cedilla and ogonek are not merely attached by
    happenstance; they are formally attached (by their canonical combining
    class, 202), and can never come below other combining marks below. (Even
    if you try, canonical reordering will move attached marks "past"
    unattached marks.)
    However, the right hook here referred to CAN come below other marks
    (just like comma below). See the example on page 7 of
    http://www.sprachatlas.phil.uni-erlangen.de/materialien/Teuthonista_Handbuch.pdf.
    (So COMBINING RIGHT HOOK BELOW cannot be unified with COMBINING OGONEK...)
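
    The reordering behaviour can be checked with a few lines of Python.
    This is a minimal sketch using only the standard unicodedata module;
    the variable names are illustrative. It uses dot below and comma below
    as stand-ins for unattached marks below, since the right hook under
    discussion is not encoded.

```python
import unicodedata

OGONEK = "\u0328"       # COMBINING OGONEK, attached below (class 202)
DOT_BELOW = "\u0323"    # COMBINING DOT BELOW, unattached below (class 220)
COMMA_BELOW = "\u0326"  # COMBINING COMMA BELOW, unattached below (class 220)


def combining_classes(text):
    """Return the canonical combining class of each code point in text."""
    return [unicodedata.combining(ch) for ch in text]


# Try to put the ogonek *below* the dot below by typing it second.
typed = "a" + DOT_BELOW + OGONEK
reordered = unicodedata.normalize("NFD", typed)

# Canonical reordering sorts marks by ascending class, so the attached
# ogonek (202) is moved in front of the unattached dot below (220).
assert combining_classes(typed) == [0, 220, 202]
assert combining_classes(reordered) == [0, 202, 220]

# Two unattached marks below share class 220, so their order is preserved
# and either one can be stacked under the other.
unattached = "a" + DOT_BELOW + COMMA_BELOW
unchanged = unicodedata.normalize("NFD", unattached)
assert unchanged == unattached
```

    A mark that needs to appear under another mark below therefore has to
    be in class 220, like comma below, and not in class 202, like ogonek.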

            /kent k


