Re: Why people still want to encode precomposed letters

From: Andrew Cunningham
Date: Tue Nov 18 2008 - 22:20:15 CST



    I stand corrected. Should have thought through my response first.

    I tend to forget about daisy-chaining dead keys for Vietnamese, or about
    dead keys at all, since they only work for some languages and not others
    (depending on the existence of precomposed forms, etc.). And from the point
    of view of users' understanding of their language, and their expectations
    of it, dead keys can be counterintuitive and undesirable.


    2008/11/19 Doug Ewell

    > Andrew Cunningham wrote:
    >> Actually, insisting on precomposed characters may not make things easier
    >> for some languages. Just thinking of the practicalities involved. Take
    >> Vietnamese as an example, each combination of vowel and tone mark exists as
    >> a single precomposed character in Unicode.
    >> Then look at Microsoft's keyboard layout for Vietnamese. Due to the design
    >> parameters of keyboard layouts on Windows, Microsoft used combining
    >> diacritics for tone marks.
    > On modern systems, there is no necessary correlation between the decision
    > to encode diacriticized letters in composed or decomposed form, and the
    > number of keystrokes required to type them on the keyboard. A single key
    > can generate two or more characters, which is common in keyboards for Indic
    > languages, or a single character can require two keystrokes, which is common
    > on just about every keyboard in Europe.
    > Microsoft used combining diacritics for tone marks in their Code Page 1258
    > because they only had 256 code points to work with, not because of the
    > number of available keys.
    > I built a customized keyboard using Microsoft Keyboard Layout Creator
    > (MSKLC) [1] that uses dead keys for most diacritics, to maximize the number
    > of possible characters. But because I wanted to be able to type Vietnamese,
    > and dead-key sequences on Windows keyboards can include no more than one
    > dead key (AFAIK), I gave precomposed letters like o-with-circumflex (ô)
    > their own key. Now I can type U+1ED1 (ố) with two keystrokes, which is
    > neither one (the number of Unicode characters for this letter in NFC) nor
    > three (NFD).
    > I'm speaking about all of this from a Windows perspective, but I'm sure it
    > is equally true for Mac, Linux, and other modern systems.
    > [1]
    > --
    > Doug Ewell * Thornton, Colorado, USA * RFC 4645 * UTN #14
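
To make the code point counts in the quoted message concrete, here is a minimal Python sketch using only the standard `unicodedata` module and the built-in `cp1258` codec. The helper name `code_points` is made up for illustration. The mixed form (ô followed by a combining acute) is an assumption about how such text is typically stored for Code Page 1258, not a description of any particular keyboard layout's output.

```python
import unicodedata


def code_points(s):
    """Return the code points of s as a list of 'U+XXXX' strings (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in s]


letter = "\u1ed1"  # LATIN SMALL LETTER O WITH CIRCUMFLEX AND ACUTE

nfc = unicodedata.normalize("NFC", letter)
nfd = unicodedata.normalize("NFD", letter)

print(code_points(nfc))  # ['U+1ED1'] : one code point
print(code_points(nfd))  # ['U+006F', 'U+0302', 'U+0301'] : three code points

# Assumed intermediate form: precomposed o-circumflex plus a combining acute.
# It is canonically equivalent to the other two and composes to U+1ED1 under NFC.
mixed = "\u00f4\u0301"
print(unicodedata.normalize("NFC", mixed) == letter)  # True

# Code Page 1258 has no slot for U+1ED1, so the fully precomposed letter
# cannot be encoded directly, while the mixed form can.
try:
    letter.encode("cp1258")
    precomposed_fits = True
except UnicodeEncodeError:
    precomposed_fits = False

mixed_bytes = mixed.encode("cp1258")
print(precomposed_fits, len(mixed_bytes))
```

None of these counts says anything about keystrokes, which is the point made in the quoted message: the keyboard layout decides how many keys are pressed, and normalization decides how many code points are stored.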

    Andrew Cunningham
    Vicnet Research and Development Coordinator
    State Library of Victoria

    This archive was generated by hypermail 2.1.5 : Tue Nov 18 2008 - 22:22:53 CST