From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Aug 06 2003 - 18:47:57 EDT
On Wednesday, August 06, 2003 11:48 PM, Peter Kirk <peter.r.kirk@ntlworld.com> wrote:
> OK, what kind of markup should I use, in any well-known markup
> language, to ensure that an isolated diacritic is centred in the
> space between the words before and after it?
In plain text, I think that this encoding:
...endOfWord1, SPACE, SPACE, diacritic, SPACE,
startOfWord2...
is what you need, as it creates the following combining sequences:
<...endOfWord1>, <SPACE>, <SPACE, diacritic>, <SPACE>,
<startOfWord2...>
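As a rough illustration of how these sequences fall out, here is a minimal Python sketch. It assumes that a combining sequence is simply a base code point followed by every code point whose general category is a mark (Mn, Mc, Me), which is a simplification. The function name combining_sequences and the choice of U+0301 COMBINING ACUTE ACCENT as the diacritic are only for illustration.

```python
import unicodedata


def combining_sequences(text):
    """Group code points into a base plus its trailing combining marks.

    Assumption: any code point with general category M* attaches to the
    preceding code point; everything else starts a new sequence.
    """
    sequences = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and sequences:
            sequences[-1] += ch      # attach the mark to the previous base
        else:
            sequences.append(ch)     # start a new sequence
    return sequences


# "ab" stands in for endOfWord1 and "cd" for startOfWord2.
sample = "ab" + " " + " \u0301" + " " + "cd"
for seq in combining_sequences(sample):
    print(" ".join(f"U+{ord(c):04X}" for c in seq))
```

With this sample, the fourth sequence printed is the pair U+0020 U+0301, that is, the SPACE that carries the diacritic, with a plain SPACE on either side of it.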
If you don't want any space around the diacritic, which must be displayed
in isolation but in the middle of a word, the following would work:
...endOfWord1, SPACE, diacritic, startOfWord2...
Here the SPACE is not a break opportunity, but just the base character
for the inserted diacritic. What the standard does not define is the
properties of such a SPACE+diacritic sequence: normally a combining
sequence inherits the properties of its base character, and the
properties of the diacritics are ignored.
But when the base character is a SPACE or NBSP, new properties may
be needed. If there is still a break opportunity on the base SPACE of a
combining sequence, it is not clear where the break occurs: before the
SPACE (i.e. before the combining sequence), or after the diacritic (i.e.
after the combining sequence)?
I think that the second option applies here, i.e. the base SPACE would
create a break opportunity at the end of the whole combining sequence
made of a SPACE and the combining characters that follow it (including
CGJ if needed to fix canonical ordering).
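The second option can be sketched in Python as follows. This only illustrates the reading proposed above and is not the line breaking algorithm of any standard. The function name break_offsets is hypothetical, and combining characters are approximated as code points whose general category is a mark (CGJ, U+034F, falls in that group).

```python
import unicodedata

SPACE = "\u0020"


def break_offsets(text):
    """Return code point offsets where a break would be allowed.

    Assumption: a break opportunity exists only after a combining sequence
    whose base is SPACE, and it sits after the last combining mark of that
    sequence. NBSP and all other bases create no opportunity. The end of
    the text is not reported.
    """
    offsets = []
    i, n = 0, len(text)
    while i < n:
        base = text[i]
        j = i + 1
        # Extend the sequence over every following combining mark.
        while j < n and unicodedata.category(text[j]).startswith("M"):
            j += 1
        if base == SPACE and j < n:
            offsets.append(j)        # break after the whole sequence
        i = j
    return offsets


print(break_offsets("ab \u0301cd"))        # [4]: after SPACE + acute
print(break_offsets("ab\u00A0\u0301cd"))   # []: NBSP base, no opportunity
```

In the first call the only offset returned is 4, so the line may break after the diacritic but not between the SPACE and the diacritic.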
Another similar case would be the use of an isolated nukta (which
normally modifies a following base character): the sequence
<nukta, SPACE> is a single combining sequence with a break
opportunity. So a sequence like <nukta, SPACE, acute accent>
would be unbreakable but would include a break opportunity at its
end, unless it is followed by a NBSP.
And the sequence <nukta, NBSP, acute accent> would be
unbreakable both in the middle and at both ends.
-- Philippe. Spam is not tolerated: any unsolicited message will be reported to your Internet service providers.