Re: missing characters: combining marks above runs of more than 2 base letters

From: Ken Whistler
Date: Fri, 18 Nov 2011 17:53:37 -0800

On 11/18/2011 5:24 PM, Philippe Verdy wrote:
> This arc in the example is definitely NOT mathematics

Nor did I say it was.

> (even if you
> have read a version where it was attempted to represent it using a
> Math TeX notation in this page, an obvious error because it used an
> angular \widehat and not the appropriate sign).


> This arc is a true
> phonetic mark of a contextual elision (the intermediate letter(s) are
> not to be pronounced, even though they are still written to explicit
> the phonetically elided word(s) and keep their usual orthography).

The fact that the function of the mark is to indicate a contextual elision
is also essentially irrelevant to the analysis of whether such marking
consists of a mark (character) in text or a mark-up (non-character) of text.

The issue to pay attention to is whether the scoping of the modification of
text is cleanly delimited to a single character at a time, or is
extensible across n characters.
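
To make the scoping distinction concrete, here is a rough sketch in Python.
It is only an illustration under stated assumptions: the `SpanMark` class,
the `covered` helper, and the sample string "abcde" are hypothetical and
invented for this sketch, and are not part of Unicode or of any proposal.
Only the `unicodedata` calls are real library API. The first half shows
marks whose scope is fixed by the character itself (one base letter, or
two for the encoded double diacritics); the second half shows a mark whose
scope is an arbitrary range, which has to be carried outside the character
stream.

```python
import unicodedata
from dataclasses import dataclass

# A combining mark in plain text: its scope is fixed by the character.
single = "e\u0302"   # U+0302 COMBINING CIRCUMFLEX ACCENT, over one base letter
double = "t\u0361s"  # U+0361 COMBINING DOUBLE INVERTED BREVE, over two base letters


def base_letters(text):
    """Return the characters of text that are not combining marks."""
    return [ch for ch in text if unicodedata.combining(ch) == 0]


@dataclass
class SpanMark:
    """Hypothetical stand-off annotation: a mark over text[start:end]."""
    start: int
    end: int
    kind: str


def covered(text, mark):
    """Return the run of characters that the annotation applies to."""
    return text[mark.start:mark.end]


# An arc over n letters is a property of a range, not of any one character,
# so it lives next to the text rather than inside it.
word = "abcde"  # placeholder text, not a real example from the thread
arc = SpanMark(start=1, end=4, kind="elision-arc")

print(unicodedata.name("\u0361"), base_letters(double))
print(arc.kind, "covers", covered(word, arc))
```

The point of the sketch is only that the second representation scales to
any n without new characters, which is what a markup layer is for.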

> Exactly similar to other phonetic symbols like the elision tie (an arc
> adjoininig two words to elide its separating space), or the apostrophe
> (which replaces completely the elided letters).
> And obviously a true candidate for plain-text: it provides
> simultaneouly two readings of the text, one is purely phonetic (and
> accurate for poems that have an essential and very strong rythmic
> structure), another is semantic (by the orthography kept). All letters
> have to be present in some way, even if some of them are marked for
> the expected phonetic.

And is obviously *not* a true candidate for plain text representation.
This kind of markup for simultaneous alternative readings of text is
precisely where representation by a richer mechanism makes sense. And this
is merely the veriest toe in the water for what I am referring to as
"text scoring".

For an example of the complexity of various approaches to these kinds of

And here is an example of a well worked-out, systematic, multi-level
scoring system for prosodic information, the ToBI annotation conventions:

