Re: Dotted Circle plus Combining Mark as Text

From: Philippe Verdy <>
Date: Sun, 20 Oct 2013 20:03:00 +0200

OK for ZWSP, I overlooked it because of its name.

In fact I want to include all characters that have some whitespace property
(to investigate), so that renderers would add them automatically to their
list of suitable base characters for which they should NOT insert any
dotted circle as the default base holder for combining marks.
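
As a starting point for that investigation, here is a minimal Python sketch
that prints the general category and Bidi class of the space characters
discussed in this thread. The names CANDIDATES and describe are illustrative
only, and str.isspace() is used as a rough stand-in for the White_Space
property (Python's unicodedata module does not expose that property directly).

```python
import unicodedata

# Space-like characters mentioned in this thread (illustrative selection).
CANDIDATES = {
    "SP": "\u0020",
    "NBSP": "\u00A0",
    "NNBSP": "\u202F",
    "ZWSP": "\u200B",
    "IDEOGRAPHIC SPACE": "\u3000",
}

def describe(ch: str) -> tuple:
    """Return (general category, Bidi class, str.isspace() result)."""
    # str.isspace() only approximates White_Space: it is true for
    # category Zs or Bidi class WS, B or S.
    return (unicodedata.category(ch), unicodedata.bidirectional(ch), ch.isspace())

if __name__ == "__main__":
    for label, ch in CANDIDATES.items():
        category, bidi, is_space = describe(ch)
        print(f"U+{ord(ch):04X} {label}: gc={category} bidi={bidi} isspace={is_space}")
```

ZWSP is the only one of these reported with general category Cf rather than
Zs, which is why its name alone is misleading.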

Such a list of base characters should be documented, and if needed, a new
property added if it cannot be inferred from existing properties alone.

You indicate that the Latin letter x is suitable for Indic scripts (Lao),
but all Latin letters are already suitable, as the Latin script is already
borrowed in lots of contexts alongside other scripts. However, it is
suitable only for use with diacritics of LTR scripts (so not for Arabic or
Hebrew diacritics), unlike the multiplication sign, which is neutral.

If we define my proposed list as a new property (to be used by renderers and
font authors), Latin letters will be excluded due to their strong LTR
property, or the algorithm should indicate that they are suitable only if
their strong Bidi property does not conflict with the strong Bidi property
of the diacritic.
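
The second option (accept a strongly directional base only when it does not
conflict with the diacritic) could be sketched as follows. This is only a
rough illustration: combining marks normally have Bidi class NSM, so the
direction of the diacritic has to come from its script, and because Python's
unicodedata module has no Script property, the sketch guesses it from the
character name. RTL_NAME_PREFIXES, mark_direction, base_direction and
base_suits_mark are all hypothetical names, and the prefix list is incomplete.

```python
import unicodedata

# ASSUMPTION: a mark belongs to an RTL script if its character name starts
# with one of these prefixes. This is a crude stand-in for the Script property.
RTL_NAME_PREFIXES = ("ARABIC", "HEBREW", "SYRIAC", "THAANA", "NKO")

def mark_direction(mark: str) -> str:
    """Guess the direction ('L' or 'R') of the script a combining mark belongs to."""
    name = unicodedata.name(mark, "")
    return "R" if name.startswith(RTL_NAME_PREFIXES) else "L"

def base_direction(base: str):
    """Return 'L' or 'R' for a strongly directional base, or None otherwise."""
    bidi = unicodedata.bidirectional(base)
    if bidi == "L":
        return "L"
    if bidi in ("R", "AL"):
        return "R"
    return None

def base_suits_mark(base: str, mark: str) -> bool:
    """A base without strong direction suits any mark; a strong base must agree."""
    direction = base_direction(base)
    return direction is None or direction == mark_direction(mark)
```

Under these assumptions, base_suits_mark("x", "\u0EB1") (Lao MAI KAN) is
accepted, base_suits_mark("x", "\u064E") (Arabic FATHA) is rejected, and
U+00D7 MULTIPLICATION SIGN is accepted with both marks.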

Most non-combining characters that are spacing and without a strong LTR or
RTL Bidi property should be in this list by default (but we should exclude
mirrorable characters, notably opening/closing punctuation like parentheses).
The Latin letter x (including its maths variants) should not, IMHO.
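
A minimal Python sketch of this default rule, built only on properties that
the standard unicodedata module exposes, might look like the following. The
function name is_default_base is hypothetical, and treating the control,
format, surrogate, private-use and unassigned categories as "not spacing" is
an assumption of the sketch, not part of any existing specification. In
particular, the Arabic joiners are format characters, so they would have to
be added to the list explicitly.

```python
import unicodedata

STRONG_BIDI = {"L", "R", "AL"}
# ASSUMPTION: these general categories are treated as "not spacing".
NON_SPACING_CATEGORIES = {"Cc", "Cf", "Cs", "Co", "Cn"}

def is_default_base(ch: str) -> bool:
    """Return True if ch would be in the proposed default list of bases."""
    category = unicodedata.category(ch)
    if category.startswith("M"):            # combining marks are not bases
        return False
    if category in NON_SPACING_CATEGORIES:  # e.g. ZWSP (Cf) has no width
        return False
    if unicodedata.bidirectional(ch) in STRONG_BIDI:
        return False                        # strong LTR or RTL, e.g. Latin x
    if unicodedata.mirrored(ch):            # e.g. parentheses
        return False
    return True
```

With these rules, U+25CC, U+00D7, SP, NBSP and the hyphen-minus are accepted,
while the Latin letter x, U+1D5D1, the opening parenthesis and ZWSP are
rejected.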

2013/10/20 Richard Wordingham <>

> On Sun, 20 Oct 2013 17:17:55 +0200
> Philippe Verdy <> wrote:
> > 2013/10/20 Richard Wordingham <>
> > Interesting, so the list of "place holders" to support increases, we
> > have:
> > - whitespaces (including SP, NBSP, NNBSP, ZWSP, ideographic...)
> > - arabic joiners
> > - U+25CC (possibly also other geometric symbols)
> > - dashes and hyphens
>
> Add diagonal crosses, such as 'x', 'X' and U+00D7 MULTIPLICATION SIGN;
> they are used for Lao (both Lao script and Tai Tham script), and I have
> seen the cross used for Khmer. U+1D5D1 MATHEMATICAL SANS-SERIF SMALL X
> and U+1D5B7 MATHEMATICAL SANS-SERIF CAPITAL X are particularly suitable
> for the role - serifs are dispreferred.
>
> Remove ZWSP - it is not a space (it has general category 'other,
> format') and has no width, making it unsuitable as a place holder.
>
> Richard.
Received on Sun Oct 20 2013 - 13:05:34 CDT
