Re: Cursor movement in Hebrew, was: Non-ascii string processing?

From: Mark E. Shoulson (mark@kli.org)
Date: Thu Oct 09 2003 - 06:46:36 CST


Peter Kirk wrote:

> On 08/10/2003 21:55, Jungshik Shin wrote:
>
>> ...
>>
>> I've got a question about the cursor movement and
>> selection in Hebrew text with such a grapheme (made up of 6 Unicode
>> characters). What would be ordinary users' expectation when delete,
>> backspace, and arrow keys (for cursor movement) are pressed around/in
>> the middle of that DGC? Do they expect backspace/delete/arrow keys to
>> operate _always_ at the DGC level or sometimes do they want them to
>> work at the Unicode character level (or its equivalent in their
>> perception of Hebrew 'letters')? Exactly the same question can be
>> asked of Indic scripts.
>> I've asked this before (discussed the issue with Marco a couple of years
>> ago), but I haven't heard back from native users of Indic scripts.
>>
>> Jungshik
>>
> I can't answer for native users of Hebrew. Maybe others can, but then
> most modern Hebrew word processing is done with unpointed text where
> this is not an issue. But I can speak for what has been done with
> Windows fonts for pointed Hebrew for scholarly purposes.
>
> In each of them, as far as I can remember, delete and backspace delete
> only a single character, not a default grapheme cluster. This is
> probably appropriate for a font used mainly for scholarly purposes,
> where representations of complex grapheme clusters may need to be
> edited to make them exactly correct. A different approach might be
> more suitable for a font commonly used for entering long texts. In
> such a case I would tend to expect backspace to cancel one keystroke -
> but that may be ambiguous of course when editing text which has not
> just been entered.
>
> Cursor movement also works at the character level. In some fonts there
> is no visible cursor movement when moving over a non-spacing
> character, which is probably the default but can be confusing to
> users. At least one font has attempted to place the cursor at
> different locations within the base character e.g. in the middle when
> there are two characters in the DGC, at the 1/3 and 2/3 points when
> there are three characters. But this is likely to get confusing when
> there are 5 or 6 characters in the DGC and their order is not entirely
> predictable.

I'm not a native speaker either, but I do have some occasion to work in
both pointed and unpointed Hebrew, and I think I would disagree with
Peter here. Certainly in the case of cursor movement, I'd expect the
cursor to move by DGCs, and not take some unclear number of keypresses
to move back a letter. With backspace/delete, I would probably want
that to work by characters within the current DGC, but once past that
(or if I'm not doing it immediately after typing the characters) it
should take out whole DGCs. The marks inside a DGC are just too messy
and potentially randomly ordered for it to make any sense to try to
edit them internally. So I guess I see Hebrew DGCs as also going through a sort
of "commitment" phase, when you type the next base character or use
cursor-movement keys to move around: at that point, the DGC should go
atomic and get deleted all at once, but so long as you're still typing
combining characters (and occasional backspaces), backspace should go
character by character (since you presumably can remember the last few
you just typed).
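
To make the "commitment" idea concrete, here is a rough Python sketch of
the behaviour described above. It is only an illustration under stated
assumptions, not how any existing editor works: the Buffer class and its
method names are made up for the example, a cluster is approximated as a
base character plus any following nonspacing or enclosing marks (the real
default grapheme cluster rules are more involved), and everything is in
logical order, so visual bidi ordering is ignored.

```python
import unicodedata


def is_mark(ch):
    # Assumption: nonspacing/enclosing marks extend the preceding cluster.
    # The real default grapheme cluster rules (UAX #29) cover more cases.
    return unicodedata.category(ch) in ("Mn", "Me")


class Buffer:
    """Hypothetical edit buffer; positions are in logical (memory) order."""

    def __init__(self):
        self.chars = []   # one code point per element
        self.cursor = 0   # insertion point, an index into self.chars
        self.fresh = 0    # characters just typed in the still-open cluster

    def text(self):
        return "".join(self.chars)

    def _cluster_start(self, pos):
        # Walk back over marks to the base character of the cluster before pos.
        i = pos - 1
        while i > 0 and is_mark(self.chars[i]):
            i -= 1
        return i

    def _cluster_end(self, pos):
        # Walk forward over the marks that follow the character at pos.
        i = pos + 1
        while i < len(self.chars) and is_mark(self.chars[i]):
            i += 1
        return i

    def insert(self, ch):
        self.chars.insert(self.cursor, ch)
        self.cursor += 1
        # A new base character commits the previous cluster and opens a new
        # one; a mark just extends the open cluster.
        self.fresh = self.fresh + 1 if is_mark(ch) else 1

    def move_left(self):
        if self.cursor > 0:
            self.cursor = self._cluster_start(self.cursor)
        self.fresh = 0  # any cursor movement commits the open cluster

    def move_right(self):
        if self.cursor < len(self.chars):
            self.cursor = self._cluster_end(self.cursor)
        self.fresh = 0

    def backspace(self):
        if self.cursor == 0:
            return
        if self.fresh > 0:
            start = self.cursor - 1  # still typing: remove one character
            self.fresh -= 1
        else:
            start = self._cluster_start(self.cursor)  # committed: whole cluster
        del self.chars[start:self.cursor]
        self.cursor = start


buf = Buffer()
for ch in "\u05E9\u05BC\u05C1\u05B8":  # shin, dagesh, shin dot, qamats
    buf.insert(ch)
buf.backspace()   # removes only the qamats; the cluster is still open
buf.move_left()   # commits the cluster and steps over all of it
buf.move_right()
buf.backspace()   # removes shin, dagesh and shin dot together
```

The single counter is the whole trick: it records how many characters of
the open cluster were just typed, so backspace can peel those off one at
a time and then fall back to whole-DGC deletion once they are used up or
the cursor has moved.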

Mind, I've not actually used all that many pointed-Hebrew text
processors; this is more my idea of how things *should* work than how
they *do* work. I think Yudit does or did something a bit like this,
though. (It must have been "did": at the moment it seems to be
consistent about always doing everything by DGC.)

~mark


