Cursor-Movement was: German Umlaut and other precomposed characters

From: Peter R. Mueller-Roemer (pmr@cs.uni-frankfurt.de)
Date: Tue Apr 26 2005 - 08:38:52 CST

Next message: Dean Snyder: "Re: String name and Character Name"

Previous message: Hans Aberg: "Re: Germa Umlaut (was: String name and Character Name)"
In reply to: Hans Aberg: "Re: Germa Umlaut (was: String name and Character Name)"
Next in thread: Peter Kirk: "Re: Cursor-Movement was: German Umlaut and other precomposed characters"
Reply: Peter Kirk: "Re: Cursor-Movement was: German Umlaut and other precomposed characters"

Several precomposed letters are necessary for easy typing in German,
Scandinavian, Greek, Hebrew ... see the different keyboard layouts.
Unicode did well to introduce combining diacritical marks, so that you
don't necessarily have to use a different keyboard just to enter a
few words in another language.
But why do all the precomposed Hebrew dagesh-consonants refuse to be
composed with vowel points AND cantillation marks? You can't even copy
the first word of the Masoretic Bible!
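
One related fact can be checked directly: the precomposed Hebrew
dagesh letters (for example U+FB31, BET WITH DAGESH) are excluded from
composition under Unicode normalization, so NFC never produces them and
always yields the base letter plus a combining dagesh. The following is
a minimal sketch using Python's standard unicodedata module; the helper
code_points is a name made up for this illustration. Whether this is
the cause of the behaviour described above depends on the editor and
font in use; the sketch only shows what normalization does.

```python
import unicodedata

BET = "\u05D1"              # HEBREW LETTER BET
DAGESH = "\u05BC"           # HEBREW POINT DAGESH OR MAPIQ
BET_WITH_DAGESH = "\uFB31"  # HEBREW LETTER BET WITH DAGESH (precomposed)


def code_points(text):
    """Hypothetical helper: list the code points of a string as U+XXXX."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# NFC does not compose BET + DAGESH into the precomposed letter ...
composed = unicodedata.normalize("NFC", BET + DAGESH)
# ... and it decomposes the precomposed letter when it meets one.
decomposed = unicodedata.normalize("NFC", BET_WITH_DAGESH)

print(code_points(composed))    # U+05D1 U+05BC
print(code_points(decomposed))  # U+05D1 U+05BC
```
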
What we need, though, is good editors that can not only compose
characters with several diacritical marks without overtyping (e.g.
acute + grave should not merge into a smudgy little x), but also find
the sequence and replace it by a single precomposed letter, so that
composed sequences are counted as single characters for cursor
movement. There should also be an easy way (e.g. Alt Gr + arrow) to
step into any precomposed letter to insert or delete individual marks.
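
The behaviour asked for above could be sketched roughly as follows.
This is only an illustration under simplifying assumptions: a "cluster"
is taken to be one base character plus any following characters of
general category Mn, Mc or Me, which is cruder than the grapheme
cluster rules of Unicode Standard Annex #29, and bidirectional text is
ignored entirely. The function names is_mark, clusters, cursor_right
and cursor_left are invented for this sketch.

```python
import unicodedata


def is_mark(ch):
    """True for nonspacing (Mn), spacing (Mc) and enclosing (Me) marks."""
    return unicodedata.category(ch) in ("Mn", "Mc", "Me")


def clusters(text):
    """Split text into base characters with their trailing marks attached."""
    result = []
    for ch in text:
        if result and is_mark(ch):
            result[-1] += ch
        else:
            result.append(ch)
    return result


def cursor_right(text, pos):
    """Move one cluster to the right, skipping over combining marks."""
    pos = min(pos + 1, len(text))
    while pos < len(text) and is_mark(text[pos]):
        pos += 1
    return pos


def cursor_left(text, pos):
    """Move one cluster to the left, landing in front of the base character."""
    pos = max(pos - 1, 0)
    while pos > 0 and is_mark(text[pos]):
        pos -= 1
    return pos


word = "a\u0308b"                             # a + combining diaeresis, then b
print(clusters(word))                         # two clusters
print(cursor_right(word, 0))                  # 2: the mark is skipped
print(unicodedata.normalize("NFC", word))     # precomposed form where one exists
```

Stepping "into" a sequence to edit a single mark would then simply be
the ordinary one-code-point movement (pos + 1 or pos - 1), bound to a
different key.
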
Unicode might not like to address the standardization of cursor
movement in multilingual texts with RtoL and LtoR entry, or of shaping
and editing combining sequences.
But leaving it to the individual editor providers will cause headaches
for those who have to use various OSes and software on several computers.
There should be a technical committee of concerned parties to provide
at least guidelines for shaping, editing, and navigating over and
within combining sequences and in bidirectional texts. The present
state is very unsatisfactory!

Peter R. Mueller-Roemer

Hans Aberg wrote:

> At 15:48 +0200 2005/04/25, Otto Stolz wrote:
>
>> you have written:
>>
>>> The Swedish language symbol ä (a with two dots above) is a separate
>>> letter, not to be viewed as an alteration of the letter a. So it is
>>> atomic. It is reasonable to enter it as a separate character. In
>>> German, however, it is an umlaut, an alteration of the letter a.
>>
>>
>> Not quite so: It has its own phonetic value (almost equal to its
>> Swedish sibling, IIRC), and is taught as a separate character in schools
>> (believe me, I am German and interested in linguistic issues, and my
>> wife is a teacher at an elementary school).
>>
>> The term "Umlaut" for a class of characters does not render these
>> umlauts as non-characters. There is a similar term, "Ablaut", e. g.
>> for the "a" and "o" in "barst" and "geborsten" (from "bersten") --
>> yet, this does not qualify "a" and "o" as non-characters, alterations
>> of "e".
>
>
> Let's take it easy: I attempted to make a formal definition of the
> notion of an abstract character, not to be confused with the many
> possible intuitive notions of a character. When defining an abstract
> character, I suggested that it should be a linguistic semantic unit
> that in some sense or another is atomic. There, the point is that
> symbols like ä can be atomized in more ways than one: It could be
> viewed as a whole, indivisible unit, or as a composite of more than
> one character. The choice may depend on the context.
>
> The second point, though, is that the preference for treating larger
> symbols as single characters in computer software is probably due to
> limitations of that software. It would probably be better,
> implementation-wise, to always represent symbols like ä as a
> combination of smaller, abstract characters, since a sufficiently
> smart program can always recognize the Swedish or German letter ä
> and give it the proper handling, and since we now know that
> representing characters in one or two bytes will not suffice anyhow.
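
The point in the quoted paragraph, that software can recognize ä
whichever way it is stored, can be illustrated with canonical
equivalence. A minimal sketch, assuming Python's standard unicodedata
module; canonically_equal is a hypothetical helper name, not a library
function:

```python
import unicodedata


def canonically_equal(a, b):
    """Compare two strings after full canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


precomposed = "\u00E4"   # LATIN SMALL LETTER A WITH DIAERESIS
combining = "a\u0308"    # a + COMBINING DIAERESIS

print(precomposed == combining)                   # False: different code points
print(canonically_equal(precomposed, combining))  # True: same abstract letter
```
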

This archive was generated by hypermail 2.1.5 : Tue Apr 26 2005 - 08:40:12 CST