Re: Caret

From: Philippe Verdy <>
Date: Mon, 12 Nov 2012 20:47:22 +0100

Note that this creates an opportunity for encoding something else!

Carets are really true glyphs which should be shown consistently with the
fonts used in the text content. This glyph is usually drawn on top of the
text (not always: it could also be inserted as an additional "character",
pushing the text around it to make it fit). It has its own identity (as a
caret), but could ultimately be any kind of symbol.

Some caret forms are well known, the vertical line and the rectangular block
being the most common. When it is a rectangular block displayed on top of
characters, its size (notably its width) should be the same as that of the
character over which it is displayed (notably when this caret is used for
editing text in "overwrite" mode, rather than in insertion mode, where the
thin vertical line is recommended).
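
To make the sizing rule concrete, here is a minimal sketch for a simple LTR line. It assumes a hypothetical list of per-character advance widths supplied by some layout engine; `caret_rect`, `INSERT_CARET_WIDTH` and the pixel units are illustrative names and choices, not taken from any real API.

```python
from typing import List, Tuple

INSERT_CARET_WIDTH = 1.0  # assumed thickness of the thin vertical line


def caret_rect(advances: List[float], index: int, line_height: float,
               overwrite: bool) -> Tuple[float, float, float, float]:
    """Return (x, y, width, height) of the caret on a simple LTR line.

    advances: hypothetical per-character advance widths.
    index: caret position, with 0 <= index <= len(advances).
    """
    if not 0 <= index <= len(advances):
        raise ValueError("caret index out of range")
    # The caret starts where the preceding characters end.
    x = sum(advances[:index])
    if overwrite and index < len(advances):
        # Block caret: exactly as wide as the character it covers.
        width = advances[index]
    else:
        # Insertion mode, or overwrite mode at the end of the line where
        # no character is covered: fall back to the thin line.
        width = INSERT_CARET_WIDTH
    return (x, 0.0, width, line_height)
```

With advances of 8, 4 and 12 units, an overwrite caret at index 1 covers the narrow second character and is only 4 units wide, while an insertion caret at the same index keeps the fixed thin width.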

But why not encode these carets as true symbols? Some of them may be
script specific. But with this encoding, the carets found in fonts would
adopt the correct form according to the font design (for example fonts in
italic or oblique styles would map an italic or oblique glyph of their

It would be even better than trying to infer a caret form from a few
font-level properties (like the average character width, or the average
angle of italic, i.e. the main direction of ascenders and descenders,
something which is not always consistent between all glyphs in a font), and
it would promote other, more usable caret forms than just a slanted line,
including even the possibility of using horizontal carets with a much
smaller ascender or descender, or just displaying an arrow head.
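
For comparison, the inference approach being argued against amounts to something like the following sketch: a single average slant angle for the whole font determines the slanted-line caret. The function name, the parameters and the angle convention (degrees clockwise from vertical, y axis pointing up, baseline at 0) are all assumptions made for illustration.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def slanted_caret(x: float, ascent: float, descent: float,
                  slant_deg: float) -> Tuple[Point, Point]:
    """Return the (bottom, top) end points of a slanted line caret.

    x: horizontal caret position on the baseline.
    ascent, descent: positive distances above and below the baseline.
    slant_deg: assumed average slant of the font, in degrees clockwise
               from the vertical (0 for upright text).
    """
    t = math.tan(math.radians(slant_deg))
    bottom = (x - descent * t, -descent)  # shifted left below the baseline
    top = (x + ascent * t, ascent)        # shifted right above the baseline
    return (bottom, top)
```

A single angle like this cannot follow glyphs whose own slant differs from the average, which is the inconsistency mentioned above.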

I don't want to promote specific forms of carets here, but instead the
encoding of some abstract classes of carets:

- a direction-neutral caret, and two oriented carets (one for the
left side, another for the right side)
- each one in two kinds: one for the overwrite mode, another for the
insertion mode.

In other words: a total of 6 kinds of carets (or possibly just 4, if the
direction-neutral caret can be represented by the left-side caret and the
right-side caret, so that it will display an orientation only within
bidirectional texts). Fonts for Latin would typically map only a vertical
line or block for the right-side insertion caret, which normally appears
after the last character typed or at the beginning of text, and a basic
rectangular block for the left-side overwrite caret.
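
The count of 6 (or 4) can be checked with a small sketch that simply enumerates the combinations. The names `Side`, `Mode` and `caret_classes` are hypothetical and only model the abstract classes proposed here; they do not correspond to any encoded characters.

```python
from enum import Enum
from itertools import product
from typing import List, Tuple


class Side(Enum):
    NEUTRAL = "neutral"
    LEFT = "left"
    RIGHT = "right"


class Mode(Enum):
    INSERT = "insert"
    OVERWRITE = "overwrite"


def caret_classes(with_neutral: bool = True) -> List[Tuple[Side, Mode]]:
    """List the abstract caret classes as (side, mode) pairs.

    with_neutral=False models the reduced set, in which the
    direction-neutral caret is represented by the two oriented ones.
    """
    sides = [s for s in Side if with_neutral or s is not Side.NEUTRAL]
    return list(product(sides, Mode))


print(len(caret_classes()))       # full set
print(len(caret_classes(False)))  # reduced set without the neutral caret
```

The two printed counts are 6 and 4, matching the two variants of the proposal.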

These 4 or 6 characters would be encoded only as abstract symbols, without
any dedicated form (the representative glyphs could be chosen using the
Arabic script, in its modern non-slanted form, as a guide, using metrics
compatible with Latin, Greek, Cyrillic, and probably sinograms and Hangul as

They could appear in isolation in plain text (where they would not be
flashing, but where they could adopt other styles like bold, italic,
colors, outer borders and decorations like underlines and overlines) for
use in documentation.

When editing Latin texts (and other simple LTR alphabetic scripts, or
sinograms and generic LTR symbols), we usually use either:

- the right-side insertion caret as a "vertical" line, possibly slanted (if
a direction is displayed with an arrow head, it points to the right)

- the left-side overwrite caret as a rectangular block, possibly slanted (if
a direction is displayed with an arrow head, it points to the right, the
arrow head being connected to the left side of the rectangle and
pointing within the rectangle)

The first one is the most frequently used (we generally work in insertion
mode in most editors).

2012/11/12 Philippe Verdy <>

> Carets in bidirectional texts CAN be oriented (meaning that they are not
> necessarily BETWEEN characters, but possibly BEFORE and/or AFTER them).
> Have you seen how the caret behaves in Java applications? It shows an
> extra triangular arrow head, oriented to the left or right, and connected
> to the top of the vertical line. And it is then really appearing NEARBY the
> character it designates in the indicated direction.
> For more complex scripts, the form of the caret could be more complex (if
> we could position it within a Hangul syllabic square, it would have to take
> the form of a corner indicating where in the composition square the
> previous character is, the corner being at the position where the syllable
> will be modified by the insertion of an additional character).
> Carets are not necessarily a simple line or block.
> 2012/11/12 QSJN 4 UKR <>
>> I have a little advice for the text editor designers. I think I am
>> either the stupidest or the smartest man in the universe if I write it
>> :(
>> A caret is a flashing line, block, or other picture in the client area
>> of a window, it indicates the place (between two characters) at which
>> text will be inserted (or the edge of the text to be selected or
>> deleted). What does it mean? Between? There is no "between" in the
>> bidirectional text, the previous and the next character are not
>> necessarily nearby!
Received on Mon Nov 12 2012 - 13:50:44 CST
