Re: Conflicting principles

From: Peter Kirk
Date: Thu Aug 07 2003 - 15:10:11 EDT

    On 06/08/2003 16:13, Michael Everson wrote:

    > At 15:18 -0700 2003-08-06, Kenneth Whistler wrote:
    >>> As someone or other said, "I believe that hitherto -- *hitherto,* mark
    >>> you -- [we have] entirely overlooked the existence of", well, scripts
    >>> that might cause a conflict between these esteemed principles.
    >> The reason why the UTC should tackle the encoding of Tengwar is not
    >> so much because it would help in the publication of Elvish poetry,
    >> but because confronting the architectural issues it poses for
    >> encoding would make an excellent tutorial case for how the two
    >> principles of combining mark order and
    >> logical order impact the task of coming up with an appropriate
    >> encoding for a complex script. And it would starkly illustrate the
    >> fact that an appropriate character encoding does not necessarily
    >> directly reflect the phonological structure of a language as
    >> represented by that script.
    > Some rather old discussion papers on this topic may be found at
    > and
    > It *is* a problem.

    Thank you, Michael. This is all really interesting. I could get into
    Tengwar, but real scripts are keeping me busy enough!

    There is an interesting parallel between Tengwar tehtar and Hebrew
    holam, in that both can be positioned above either the preceding or the
    following base character without any difference in sense. The difference
    is that in Tengwar the choice depends on the "mode" and so on the
    language being used, and so is consistent within a text; but in Hebrew
    the choice depends both on the exact context and on the typographer's
    preferences. The typographer's preferences should be consistent in a
    text, or at least in a string with the same character formatting, but
    the contextual element means that positioning may vary even within a
    word. In most cases, in Hebrew, the positioning is algorithmically
    determined, although the precise algorithm depends on the typographer,
    but there are a few ambiguous cases. A bit of a nightmare. But it does
    tend to confirm that the UTC need not look at artificial scripts to find
    issues to stretch its thinking; it can find them in real scripts in
    common current use.
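
    To make the encoding side of this concrete, here is a minimal sketch
    in Python. It assumes the usual convention that a combining mark is
    stored after the base letter it belongs to in logical order; the
    sample word (lamed + holam + alef) and the helper name describe() are
    illustrative choices, not taken from the discussion. It only lists
    what is stored, and says nothing about where a font draws the dot.

```python
import unicodedata

# Illustrative sample: lamed, holam, alef in logical (stored) order.
# The holam code point follows the lamed, whichever letter a given
# typographer would draw the dot over.
word = "\u05DC\u05B9\u05D0"

def describe(text):
    """Return (character name, canonical combining class) per code point."""
    return [(unicodedata.name(ch), unicodedata.combining(ch)) for ch in text]

for name, ccc in describe(word):
    # Base letters report class 0; combining marks report a non-zero class.
    print(f"{name:<22} ccc={ccc}")
```

    Under that assumption the stored text is one and the same sequence,
    and the choice of position is left to rendering.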

    Peter Kirk
