Re: Dotted Circle plus Combining Mark as Text

From: Philippe Verdy <>
Date: Tue, 22 Oct 2013 13:57:37 +0200

2013/10/22 Richard Wordingham <>

> On Tue, 22 Oct 2013 01:40:39 +0200
> Philippe Verdy <> wrote:
> > You still don't understand: I want the composite to behave as if it
> > was a letter that is missing and it is supposed to replace (including
> > in the middle of a word). There's no attempt to insert a line break
> > (in fact I don't want it before or after, unless there are breaking
> > characters around such as punctuation or spaces).
> By almost all that's in the Unicode standard, placeholder base
> character plus combining mark (2 characters in total) should render as
> though the placeholder were a letter. No control character should be
> necessary - gluing them together with WJ would not improve things.

You still don't understand: I want to determine *which* placeholder character
to use, i.e. WJ here, exactly to make sure that the combining mark will appear
in its "ill-form" with the base dotted glyph. I want a control that does not
prohibit the "composite" (control+diacritic) from behaving like a letter: it
should not add any new break opportunity, and it should be as transparent as
possible for collation, except that the diacritic must not be grouped with the
prior cluster, it must block normalisation reordering, and the resulting glyph
must be spacing (the diacritic will be rendered over the dotted base glyph).
The composite would behave like the Hangul leading letter IEUNG.

This is why WJ will improve things. There are situations for such use, for
example:
  printf("Your input was <%s>\n", data);
If data starts with a combining mark, it would combine with '<', and the dotted
base glyph would not be visible. I want to use:
  printf("Your input was <%s%s>\n", placeholder, data);
(Other similar situations include input forms, including on the web, where
the placeholder will be placed immediately before the input element,
possibly formatted by JavaScript, e.g. in an online IME editor, or when
presenting charts.)
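
As a minimal sketch of the second printf call, assuming the placeholder is WJ (U+2060) and that all strings are UTF-8: the helper name format_input_report and the macro PLACEHOLDER_WJ are hypothetical, chosen only for illustration. The sketch only shows how the placeholder is spliced in ahead of the data; whether a renderer then shows the dotted base glyph is exactly the open question of this thread.

```c
#include <assert.h>
#include <stdio.h>

/* U+2060 WORD JOINER encoded as UTF-8 (bytes E2 81 A0). */
#define PLACEHOLDER_WJ "\xE2\x81\xA0"

/*
 * Hypothetical helper: writes "Your input was <PLACEHOLDER DATA>" plus a
 * newline into out. Returns what snprintf returns, i.e. the number of bytes
 * the full message needs, so a value >= size means the output was truncated.
 */
static int format_input_report(char *out, size_t size,
                               const char *placeholder, const char *data)
{
    assert(out != NULL && placeholder != NULL && data != NULL);
    return snprintf(out, size, "Your input was <%s%s>\n", placeholder, data);
}
```

A caller would pass PLACEHOLDER_WJ as the third argument, so that data beginning with a combining mark is preceded by the placeholder rather than by '<'.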

Which value to use for the placeholder string? Several candidates:
- U+25CC does not work if the data does not start with a combining mark (it
should not be visible at all in that case).
- CGJ does not work (it breaks things with Asian scripts).
- SHY will work only if there's no line break, but the placeholder glyph may
also not appear at all; it also adds a word break.
- WJ seems OK.
- An empty pair of BiDi embedding controls could be OK as well.
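
For experimenting with the candidates above, here is a rough sketch of them as UTF-8 string constants. The table, the short labels and the lookup function placeholder_utf8 are hypothetical names for illustration. The BiDi entry assumes LRE (U+202A) followed by PDF (U+202C), since the list does not say which embedding pair is meant.

```c
#include <stddef.h>
#include <string.h>

struct placeholder_candidate {
    const char *name;  /* short label, hypothetical */
    const char *utf8;  /* UTF-8 encoding of the candidate string */
};

static const struct placeholder_candidate candidates[] = {
    { "U+25CC", "\xE2\x97\x8C" },  /* DOTTED CIRCLE */
    { "CGJ",    "\xCD\x8F" },      /* U+034F COMBINING GRAPHEME JOINER */
    { "SHY",    "\xC2\xAD" },      /* U+00AD SOFT HYPHEN */
    { "WJ",     "\xE2\x81\xA0" },  /* U+2060 WORD JOINER */
    /* Assumption: LRE (U+202A) + PDF (U+202C) as the empty embedding pair. */
    { "BIDI",   "\xE2\x80\xAA\xE2\x80\xAC" },
};

/* Returns the UTF-8 string for a label, or NULL if the label is unknown. */
static const char *placeholder_utf8(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof candidates / sizeof candidates[0]; i++) {
        if (strcmp(candidates[i].name, name) == 0)
            return candidates[i].utf8;
    }
    return NULL;
}
```

The returned string can be passed as the placeholder argument of the printf call shown earlier, which makes it easy to compare how a given renderer treats each candidate.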
Received on Tue Oct 22 2013 - 07:01:06 CDT
