Re: Conflicting principles

From: Peter Kirk
Date: Thu Aug 07 2003 - 14:23:10 EDT


    On 06/08/2003 16:12, John Jenkins wrote:

    > On Wednesday, August 6, 2003, at 3:53 PM, Peter Kirk wrote:
    >> This answer presupposes that there is a well-defined concept of which
    >> base character a combining mark belongs to. That is not always true.
    >> The particular combining mark which precipitated the debate may be
    >> situated above the gap between the (logically and phonetically)
    >> preceding and following characters, or may move on to the preceding
    >> or the following characters depending on the precise context and on
    >> the typographer's preference.
    > If its behavior is substantially different from that of existing
    > combining marks, then there's no reason not to suggest it be added
    > with its own properties. Just don't add it as a combining mark.
    > (This is basically what happened, e.g., with the ideographic
    > description characters.)

    Good idea in principle. The trouble is that it already is a combining
    mark and presumably has to remain one; at the very least it has to
    retain its combining class! Is there any way we can adjust the
    properties without encoding a new character?

    Ken suggested looking at Tengwar as a tutorial case to refine Unicode
    principles. Well, Hebrew can also be a tutorial case to refine Unicode
    principles. Since Tengwar, and some aspects of Indic scripts, don't fit
    nicely into the conflicting principles, we shouldn't try to squeeze
    Hebrew into this straitjacket as a matter of principle.

    >> Anyway, John J, what code are we talking about that has to work from
    >> the positions of the combining marks back to the underlying
    >> representation? Are you talking about OCR?
    > No, the issue is more how to start from a base form and work forward
    > to encompass the whole series of characters which need to be treated
    > "as one" in certain processes, which can include cursor movement, hit
    > testing, display, line breaking, collation, normalization.

    Thanks for the clarification. I certainly see the issues with cursor
    movement and hit testing. Normalisation is unfortunately so fixed that
    it is not an issue.
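
    To make the "base form and work forward" idea concrete, here is a rough
    sketch in Python. It is only an illustration, not anything proposed in
    this thread: the function name combining_sequences is made up, and the
    rule it uses (attach every character with a non-zero canonical combining
    class to the preceding character) is a deliberate simplification of real
    text segmentation. It also bakes in exactly the assumption questioned
    above, namely that every mark belongs to the base that precedes it.

```python
import unicodedata


def combining_sequences(text):
    """Group each base character with the combining marks that follow it.

    Hypothetical helper for illustration only. A character is treated as a
    combining mark when its canonical combining class is non-zero, which is
    a simplification of full grapheme cluster segmentation.
    """
    sequences = []
    for ch in text:
        # unicodedata.combining() returns the canonical combining class;
        # 0 means the character starts a new sequence.
        if sequences and unicodedata.combining(ch) != 0:
            sequences[-1] += ch
        else:
            sequences.append(ch)
    return sequences


# Hebrew bet + dagesh (class 21) + sheva (class 10), followed by alef.
sample = "\u05D1\u05BC\u05B0\u05D0"
units = combining_sequences(sample)

# Canonical ordering sorts adjacent marks by combining class, so
# normalization moves the sheva in front of the dagesh.
normalized = unicodedata.normalize("NFD", sample)
```

    A process such as cursor movement or hit testing would then step over
    each element of units as one item. The normalized string shows why the
    combining class matters: normalization reorders the two marks according
    to their classes, whatever order they were typed in.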

    Peter Kirk (personal) (work)

    This archive was generated by hypermail 2.1.5 : Thu Aug 07 2003 - 15:05:18 EDT