From: Jon Hanna (email@example.com)
Date: Thu Aug 07 2003 - 07:32:36 EDT
> what code are we talking about that has to work from the
> positions of the combining marks back to the underlying representation?
Such code is not just common and widespread, it is practically ubiquitous.
The principle that base characters always come first is used:
Whenever you need to calculate the size of a visual representation of a
Whenever you need to move a caret, or locate the caret position closest to a
Whenever you perform normalisation.
Whenever you insert a substring which may not begin with a base character
into another string.
Whenever you need to guarantee that a portion of streamed text is
sufficiently complete that operations on it won't have to be redone when
more characters are received.
Whenever you need to examine the properties of a character which may change
if combined (e.g. breaking properties can be changed when combined).
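To make the dependency concrete, here is a minimal sketch in Python of the kind of routine the cases above tend to share. It is an illustration only, not code from any particular library. The function names (is_combining, combining_sequences, complete_prefix) are hypothetical, and treating any character with a general category starting with "M" as a combining mark is a simplifying assumption. Note that every step relies on marks following their base.

```python
import unicodedata


def is_combining(ch):
    # Simplifying assumption: general categories Mn, Mc and Me are
    # treated as combining marks; everything else is a base.
    return unicodedata.category(ch).startswith("M")


def combining_sequences(text):
    """Split text into runs of one base character plus the marks after it.

    A mark with no preceding base (for example at the start of an
    inserted substring) becomes a sequence of its own.
    """
    sequences = []
    for ch in text:
        if sequences and is_combining(ch):
            # Base-first ordering: a mark always attaches backwards.
            sequences[-1] += ch
        else:
            sequences.append(ch)
    return sequences


def complete_prefix(buffer):
    """Split streamed text into (safe, pending).

    Everything before the last sequence is complete, because a later
    mark can only attach to the final base. The last sequence stays
    pending until another base arrives or the stream ends.
    """
    sequences = combining_sequences(buffer)
    if not sequences:
        return "", ""
    return "".join(sequences[:-1]), sequences[-1]
```

If marks could precede their base, neither function could make its decision by looking backwards only: combining_sequences would need lookahead, and complete_prefix could no longer promise that the text before the last base is finished.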
This is not necessarily code that couldn't be rewritten to allow cases where
combining marks precede base characters (though it may become considerably
more complicated, frightfully so in some cases, which in turn would lead
some developers to neglect full support for the scripts that used this new
feature). It is code that is all over the place, and much of it would be hard
to track down. Unless coders have all nicely isolated the process
of locating combining sequences (and you just know some of them haven't),
it's going to be a mess trying to upgrade.
This doesn't say we should automatically dismiss any proposal to change the
principle, but it does weigh heavily against any such change.