Re: Dead keys (was: "Re: Monetary decimal separators")

From: Jukka K. Korpela
Date: Mon Sep 19 2005 - 01:36:32 CDT

    On Sun, 18 Sep 2005, António Martins-Tuválkin wrote:

    > On 2005.09.18, 07:58, Jukka K. Korpela wrote:
    >> Dead keys are an important practical problem. People have difficulties
    >> in learning to use them. People may have used computers for many, many
    >> years without ever realizing how they can use dead keys to type letters
    >> with diacritic marks.
    > Which locales are you referring to?

    I worked for a long time with users who need to write computer programs,
    commands, and other technical text that uses the tilde, the grave accent,
    and the circumflex as special characters (e.g., for negation, backquote,
    and exponentiation), with no apparent connection to any use as
    diacritics. Besides, as you know, the glyphs of the tilde and the
    circumflex hardly suggest that they could be used as diacritics: they
    are far too big and wrongly positioned.

    In such an environment, and in a less technical environment as well,
    people would normally not bother trying to use letters with diacritic
    marks, unless they appear as precomposed (say, as "") in keycaps.
    They simply omit the diacritics. After all, major publishing companies do
    that routinely as well - perhaps even as a matter of decided policy, not
    just laziness.

    Thus, when in the course of events someone wants or needs to type a letter
    with a diacritic, he will look for various methods like Character Map or
    Alt-something, never realizing that some keys on his keyboard are dead
    keys that could be used conveniently. Unless someone tells him, of course.

    > My experience with computer-unsavvy
    > people in Portugal is quite the opposite: Being used to typing, say, [dead
    > acute] [a] for U+00E1, some are truly shocked when they find out that it
    > is not possible (with a Portuguese keyboard) to get a U+0107 by typing
    > [dead acute] [c] (this letter is not used in Portuguese, so most people
    > here almost never need to type it, anyway).

    I can understand that. I think the difference is that normal writing of
    Portuguese requires several different letters with diacritic marks and the
    Portuguese keyboard does not contain all of them as precomposed
    characters. Thus, people _need_ to learn to use dead keys for quite normal
    texts, even if they contain no foreign words. On the other hand, if a
    language normally needs just a few characters (like "" and "" only)
    and they exist on the keyboard, perhaps with keys of their own, there is
    much less need to learn about the dead keys.

    The example you mention, U+0107, illustrates well the problems with
    dead keys. In a Unicode environment, it would be natural to extend their
    functionality, but this implies some problems too. If every combination
    of dead acute and a letter produced an accented letter whenever the
    combination exists in Unicode as a precomposed character, it would be
    easier to type foreign words and special notations, but it would also
    produce effects that are unexpected to many people.

    For example, if someone is accustomed to using the acute accent as a
    single quotation mark, as in ´cat´, the extended functionality would turn
    ´c into a c with acute. Similarly, people who are used to writing URLs
    that contain a tilde by just using the tilde key, without
    knowing or caring about its being really a dead key, would be surprised at
    seeing the ~e change to e with tilde. Anything that conflicts with
    people's _habits_ of typing means problems and resistance.

    The change might be worth it, at least in the long run, but users would
    need to be informed about it, and that's tough. Perhaps this should be a
    user-settable option with a default set, at least for some time, to the
    old behavior that people are used to. The extended functionality could
    then be advertised as helping people who need to type foreign
    characters, rather than arriving as a surprise change to customary behavior.

    The extended behavior could work by different criteria, exemplified with
    the following (where [acute] means a dead acute key, [´] means the
    (spacing) acute accent character, and [?] is any character):
    - [acute][?] produces [?] with acute whenever this combination
       exists as a precomposed character in Unicode, otherwise it
       produces [?][´]
    - as above, but with "whenever this combination exists as a precomposed
       character in a set of characters specified by locale settings, or
       by explicit user settings"
    - [acute][?] always produces [?][combining acute accent], which might
       then be replaced by a precomposed character by NFC rules
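
    The three criteria above can be sketched roughly as follows, in Python.
    This is only an illustration under stated assumptions: the function
    names are made up for this sketch, the "allowed" set is a hypothetical
    stand-in for locale or user settings (it is not real CLDR data), and
    NFC composition from the standard unicodedata module is used as the test
    for "exists as a precomposed character".

```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # U+0301 COMBINING ACUTE ACCENT
SPACING_ACUTE = "\u00b4"    # U+00B4 ACUTE ACCENT (spacing)


def precomposed(base):
    """Return the precomposed form of base + acute, or None if Unicode has none."""
    composed = unicodedata.normalize("NFC", base + COMBINING_ACUTE)
    return composed if len(composed) == 1 else None


def compose_if_in_unicode(base):
    """First criterion: compose whenever a precomposed character exists."""
    result = precomposed(base)
    # Fallback: the letter plus a spacing acute, in the order the list gives.
    return result if result is not None else base + SPACING_ACUTE


def compose_if_in_set(base, allowed):
    """Second criterion: compose only into characters from an allowed set."""
    result = precomposed(base)
    if result is not None and result in allowed:
        return result
    return base + SPACING_ACUTE


def combine_then_normalize(base):
    """Third criterion: always emit a combining acute, then apply NFC."""
    return unicodedata.normalize("NFC", base + COMBINING_ACUTE)


# Hypothetical stand-in for a locale's character set (not real CLDR data):
# a, e, i, o, u with acute.
SAMPLE_SET = set("\u00e1\u00e9\u00ed\u00f3\u00fa")


def codepoints(text):
    """Format a string as a list of U+XXXX code points for display."""
    return [f"U+{ord(ch):04X}" for ch in text]


for key in "act":
    print(key,
          codepoints(compose_if_in_unicode(key)),
          codepoints(compose_if_in_set(key, SAMPLE_SET)),
          codepoints(combine_then_normalize(key)))
```

    With these definitions, [acute][c] gives U+0107 under the first and
    third criteria but c followed by U+00B4 under the second, since U+0107
    is not in the sample set. [acute][t] has no precomposed form, so the
    third criterion leaves t followed by U+0301.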

    (Especially in the third approach, it would be more logical to make the
    dead keys really keys for combining diacritic marks. Most people would
    probably find it more natural to add an accent _after_ typing a base
    letter, at least if they had no experience with how dead keys work.
    But this would probably be too big a change now.)

    The second, intermediate approach could use CLDR data about use of
    characters in different locales. But I think it would be a compromise that
    combines the drawbacks of the simpler alternatives. What matters here is
    not the user's native language but the _combination_ of languages he uses,
    and describing that would be practically difficult. Besides, the approach
    would make the surprise effect bigger: if you are accustomed to using dead
    keys in quite many combinations with base characters, it will be awkward
    to discover that some accented characters that are rare in your
    environment cannot be typed in that simple, convenient way,
    for no obvious reason.

    Jukka "Yucca" Korpela
