Re: Caret

From: Philippe Verdy <>
Date: Sun, 25 Nov 2012 01:37:50 +0100

Yes, that's true, but for now we are writing all this in the context of the
BiDi algorithms. Complex layouts (those that are not purely horizontal, e.g.
the layout of diacritics relative to their base character) are not in scope
here:

It is recommended to treat a combining sequence as an unbreakable unit.

But even if editors do not allow positioning an insertion point between a
base character and a following diacritic, they still allow deleting only
this diacritic (which cannot be selected alone): you position the caret
just after the diacritic, provided it is not encoded as part of a
precomposed (decomposable) character, and press the BACKSPACE key. The same
could be done with conjoined sequences of jamos by deleting just the last
jamo.
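
As a rough illustration of that behaviour (a minimal sketch only, not how
any particular editor does it): the caret stops only at cluster boundaries,
while BACKSPACE removes a single code point. The function names are
hypothetical, and the cluster test is an assumption that is much simpler
than real grapheme cluster segmentation: it only looks at general category
M* and at conjoining vowel/trailing jamos (U+1160..U+11FF).

```python
import unicodedata

def extends_cluster(ch):
    """Simplified test: combining marks and conjoining V/T jamos
    attach to the preceding character (assumption, not full UAX #29)."""
    return (unicodedata.category(ch).startswith("M")
            or "\u1160" <= ch <= "\u11ff")

def caret_stops(text):
    """Offsets where the caret may be placed: never inside a cluster."""
    stops = [0]
    for i in range(1, len(text)):
        if not extends_cluster(text[i]):
            stops.append(i)
    if text:
        stops.append(len(text))
    return stops

def backspace(text, caret):
    """Delete only the code point just before the caret."""
    if caret == 0:
        return text, 0
    return text[:caret - 1] + text[caret:], caret - 1
```

With "e" + U+0301 + "x" the caret can stop at offsets 0, 2 and 3 only, yet
BACKSPACE at offset 2 removes just the accent. With a precomposed U+00E9
there is a single code point, so the same key removes the whole letter.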

Allowing the caret to be positioned in the middle of a Hangul cluster,
between two conjoined jamos, is a challenge: it is clearly difficult to
determine the position, size and glyph to use for a caret placed inside the
composed cluster. It will also be difficult to make a text selection that
starts with only a middle or final part of a cluster (without its first
character) and/or ends with only a starting or middle part of a cluster
(without its final character): even where it is impossible to display any
"caret", we still need a way to represent the selection.

If such a selection is needed in editors, then they will most often need to
"visually decompose" the clusters into a horizontal layout (not the
standard layout), using uncomposed glyphs for the individual characters, so
that each of them can be selected. Combining characters would be displayed
as if they were in defective sequences, with a dotted circle; for Hangul
clusters, the jamos would be laid out horizontally, possibly using an
alternate glyph showing that they should still be conjoined on one side or
the other. In that case the solution developed for simple horizontal
layouts (including with BiDi) will still work (with only one caret,
or possibly two distinct carets only when there's a change of UBA-resolved
direction on each side of the insertion point).
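
A minimal sketch of such a "visually decomposed" view, under stated
assumptions: the function name is hypothetical, one display cell is
produced per encoded character, and only combining marks (general category
M*) get special treatment by being shown on U+25CC DOTTED CIRCLE. The
alternate glyphs suggested above for jamos are not modelled; jamos are
simply emitted as separate cells.

```python
import unicodedata

DOTTED_CIRCLE = "\u25cc"

def decomposed_cells(text):
    """Return one (source_offset, display_string) cell per code point,
    so that every encoded character gets its own selectable cell."""
    cells = []
    for offset, ch in enumerate(text):
        if unicodedata.category(ch).startswith("M"):
            # Show the mark as if it were in a defective sequence.
            cells.append((offset, DOTTED_CIRCLE + ch))
        else:
            cells.append((offset, ch))
    return cells
```

Each cell maps back to exactly one offset in the encoded text, so caret
movement and selection can then be handled as in a plain horizontal layout.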

Zero-width invisible characters (notably most C0 and C1 controls that are
not visible whitespace or line breakers; or format controls, such as the
soft hyphen, which is most often not displayed except at line breaks)
create another challenge: as they are invisible and have no display
width, the display positions before and after them are exactly the same:

 * What position must be determined in the encoded character stream if
you click where there's a zero-width character? Before or after it? Can
a GUI feature (mouse gesture or distinct keystrokes) allow making this
choice of position (notably for text selection) without having to count
incremental steps using the LEFT or RIGHT arrow key?

 * It is suggested that the position should avoid being in the middle of a
sequence that is part of the layout, if that helps eliminate some possible
positions, but in some cases several alternatives will still remain and
the selection will be arbitrary...

 * ...unless we are in an editing mode that makes zero-width controls
visible (with a non-zero-width glyph), in which case we return to the
simple horizontal layout (including with BiDi support).
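
The ambiguity can be made concrete with a small sketch. Everything in it is
an assumption for illustration: the function names are hypothetical, every
visible character is given a width of one cell, BiDi is ignored, and
"zero-width" is approximated as general category Cf or Cc other than tab,
LF and CR.

```python
import unicodedata

def is_zero_width(ch):
    """Approximation: format and control characters, except tab/LF/CR."""
    return unicodedata.category(ch) in ("Cf", "Cc") and ch not in "\t\n\r"

def caret_x(text, reveal=False):
    """x coordinate (in cells) of every caret offset 0..len(text).
    With reveal=True, zero-width characters get a visible one-cell glyph."""
    xs = [0]
    for ch in text:
        width = 0 if (is_zero_width(ch) and not reveal) else 1
        xs.append(xs[-1] + width)
    return xs

def offsets_at(text, x, reveal=False):
    """All offsets in the encoded text that are displayed at position x."""
    return [i for i, pos in enumerate(caret_x(text, reveal)) if pos == x]
```

For "co" + U+00AD + "op", a click at x = 2 matches both offsets 2 and 3, so
the editor has to pick one of them arbitrarily or offer a way to choose.
With reveal=True the same click matches offset 2 only.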

2012/11/23 QSJN 4 UKR <>

> Folks, you are talking about left and right, but don't you forget
> about more complex scripts? How to set the caret inside the ligature —
> if its components are a consonant and a vowel sign with two parts:
> below-base and above-base, for example?? I repeat again what I already
> had written: stop thinking as old typewritermachine: «bum—shift,
> bum—shift». Some peoples use a cursive writing.
Received on Sat Nov 24 2012 - 18:44:27 CST
