
Unicode Technical Standard #35

Unicode Locale Data Markup Language (LDML)
Part 7: Keyboards

Version 26
Editors Steven Loomis (srl@icu-project.org) and other CLDR committee members

For the full header, summary, and status, see Part 1: Core

Summary

This document describes parts of an XML format (vocabulary) for the exchange of structured locale data. This format is used in the Unicode Common Locale Data Repository.

This is a partial document, describing keyboard mappings. For the other parts of the LDML see the main LDML document and the links above.

Status

This document has been reviewed by Unicode members and other interested parties, and has been approved for publication by the Unicode Consortium. This is a stable document and may be used as reference material or cited as a normative reference by other specifications.

A Unicode Technical Standard (UTS) is an independent specification. Conformance to the Unicode Standard does not imply conformance to any UTS.

Please submit corrigenda and other comments with the CLDR bug reporting form [Bugs]. Related information that is useful in understanding this document is found in the References. For the latest version of the Unicode Standard see [Unicode]. For a list of current Unicode Technical Reports see [Reports]. For more information about versions of the Unicode Standard, see [Versions].


1 Keyboards

The CLDR keyboard format provides for the communication of keyboard mapping data between different modules, and the comparison of data across different vendors and platforms. The standardized identifier for keyboards can be used to communicate, internally or externally, a request for a particular keyboard mapping that is to be used to transform either text or keystrokes. The corresponding data can then be used to perform the requested actions.

For example, a web-based virtual keyboard may transform text in the following way. Suppose the user types a key that produces a "W" on a qwerty keyboard. A web-based tool using an azerty virtual keyboard can map that text ("W") to the text that would have resulted from typing a key on an azerty keyboard, by transforming "W" to "Z". Such transforms are in fact performed in existing web applications.
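As a rough illustration of this kind of text-to-text mapping, the following Python sketch re-targets characters by going through the shared ISO position. The two small tables are abridged from the en-t-k0-osx and fr-BE-t-k0-windows examples in this document; the function name retarget is hypothetical and is not part of the specification.

```python
# Abridged key maps, keyed by ISO position (values taken from the examples
# in this document: US "caps" map and Belgian French "shift" map).
US_QWERTY = {"D01": "Q", "D02": "W"}
BE_AZERTY = {"D01": "A", "D02": "Z"}


def retarget(text, source, target):
    """Map each character of `text` from the `source` layout to `target`.

    Characters that the source layout does not produce are passed through
    unchanged. `retarget` is a hypothetical helper, not a CLDR API.
    """
    iso_by_char = {char: iso for iso, char in source.items()}
    return "".join(target.get(iso_by_char.get(char), char) for char in text)


print(retarget("W", US_QWERTY, BE_AZERTY))
```

With these tables, "W" is found at position D02 in the qwerty map, and D02 in the azerty map yields "Z".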

The data can also be used in analysis of the capabilities of different keyboards. It also allows better interoperability by making it easier for keyboard designers to see which characters are generally supported on keyboards for given languages.

To illustrate this specification, here is an abridged layout representing the English US 101 keyboard on the Mac OSX operating system (with an inserted long-press example). For more complete examples, and information collected about keyboards, see keyboard data in XML.

<keyboard locale="en-t-k0-osx">
<version platform="10.4" number="$Revision: 8294 $" />
<generation date="$Date: 2013-03-09 01:08:49 -0800 (Sat, 09 Mar 2013) $" />
<names>
<name value="U.S." />
</names>
<keyMap>
<map iso="E00" to="`" />
<map iso="E01" to="1" />
<map iso="D01" to="q" />
<map iso="D02" to="w" />
<map iso="D03" to="e" longPress="é è ê ë" />

</keyMap>
<keyMap modifiers="caps">
<map iso="E00" to="`" />
<map iso="E01" to="1" />
<map iso="D01" to="Q" />
<map iso="D02" to="W" />

</keyMap>
<keyMap modifiers="opt">
<map iso="E00" to="`" />
<map iso="E01" to="¡" /> <!-- key=1 -->
<map iso="D01" to="œ" /> <!-- key=Q -->
<map iso="D02" to="∑" /> <!-- key=W -->

</keyMap>
<transforms type="simple">
<transform from="` " to="`" />
<transform from="`a" to="à" />
<transform from="`A" to="À" />
<transform from="´ " to="´" />
<transform from="´a" to="á" />
<transform from="´A" to="Á" />
<transform from="˜ " to="˜" />
<transform from="˜a" to="ã" />
<transform from="˜A" to="Ã" />

</transforms>
</keyboard>

And its associated platform file (which includes the hardware mapping):

<platform id="osx">
<hardwareMap>
<map keycode="0" iso="C01" />
<map keycode="1" iso="C02" />
<map keycode="6" iso="B01" />
<map keycode="7" iso="B02" />
<map keycode="12" iso="D01" />
<map keycode="13" iso="D02" />
<map keycode="18" iso="E01" />
<map keycode="50" iso="E00" />
</hardwareMap>
</platform>
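The relationship between the two files can be sketched in Python as follows: the platform file turns a hardware key code into an ISO position, and the layout file turns that position into output. This is only a minimal sketch using abridged copies of the data above. The helper names (load_hardware_map, load_key_maps, output_for) are hypothetical, and the modifiers attribute is compared as a literal string, without the expansion or fallback rules described later in this document.

```python
import xml.etree.ElementTree as ET

# Abridged copies of the layout and platform examples above.
LAYOUT_XML = """
<keyboard locale="en-t-k0-osx">
  <keyMap>
    <map iso="E01" to="1" />
    <map iso="D01" to="q" />
    <map iso="D02" to="w" />
  </keyMap>
  <keyMap modifiers="caps">
    <map iso="D01" to="Q" />
    <map iso="D02" to="W" />
  </keyMap>
</keyboard>
""".strip()

PLATFORM_XML = """
<platform id="osx">
  <hardwareMap>
    <map keycode="0" iso="C01" />
    <map keycode="12" iso="D01" />
    <map keycode="13" iso="D02" />
    <map keycode="18" iso="E01" />
  </hardwareMap>
</platform>
""".strip()


def load_hardware_map(platform_xml):
    """Return {key code: ISO position} from a platform file."""
    root = ET.fromstring(platform_xml)
    return {int(m.get("keycode")): m.get("iso") for m in root.iter("map")}


def load_key_maps(layout_xml):
    """Return {modifiers attribute (None for the base map): {ISO: output}}."""
    root = ET.fromstring(layout_xml)
    return {
        key_map.get("modifiers"): {m.get("iso"): m.get("to") for m in key_map.findall("map")}
        for key_map in root.findall("keyMap")
    }


HARDWARE = load_hardware_map(PLATFORM_XML)
KEY_MAPS = load_key_maps(LAYOUT_XML)


def output_for(keycode, modifiers=None):
    """Look up the output for a key code, or None when nothing is mapped."""
    iso = HARDWARE.get(keycode)
    return KEY_MAPS.get(modifiers, {}).get(iso)


print(output_for(13), output_for(13, "caps"))
```

Here key code 13 resolves to D02, which gives "w" in the base map and "W" in the caps map; key code 0 resolves to C01, for which the abridged layout defines no map, so no output is produced.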

1.1 Goals and Nongoals

Some goals of this format are:

  1. Make the XML as readable as possible.
  2. Represent faithfully keyboard data from major platforms: it should be possible to create a functionally-equivalent data file (such that given any input, it can produce the same output).
  3. Make as much commonality in the data across platforms as possible to make comparison easy.

Some non-goals (outside the scope of the format) currently are:

  1. Display names or symbols for keycaps (eg, the German name for "Return"). If that were added to LDML, it would be in a different structure, outside the scope of this section.
  2. Advanced IME features, handwriting recognition, etc.
  3. Roundtrip mappings: the ability to recover precisely the same format as an original platform's representation. In particular, the internal structure may have no relation to the internal structure of external keyboard source data; the only goal is functional equivalence.
  4. More sophisticated transforms, such as for Indic character rearrangement. It is anticipated that these would be added to a future version, after working out a reasonable representation.

Note: During development of this section, it was considered whether the modifier RAlt (=AltGr) should be merged with Option. In the end, they were kept separate, but for comparison across platforms implementers may choose to unify them.

1.2 Definitions

Keyboard: The physical keyboard.

Key: A key on a physical keyboard.

Modifier: A key that is held to change the behavior of a keyboard. For example, the "Shift" key allows access to upper-case characters on a US keyboard. Other modifier keys include, but are not limited to, Ctrl, Alt, Option, Command and Caps Lock.

Key code: The integer code sent to the application on pressing a key.

ISO position: The corresponding position of a key using the ISO layout convention where rows are identified by letters and columns are identified by numbers. For example, "D01" corresponds to the "Q" key on a US keyboard. For the purposes of this document, an ISO layout position is depicted by a one-letter row identifier followed by a two-digit column number (like "B03", "E12" or "C00"). The following diagram depicts a typical US keyboard layout superimposed with the ISO layout indicators. Note that the number of keys and their physical placement relative to each other in this diagram is irrelevant; what matters is their logical placement using the ISO convention.

keyboard layout example showing ISO key numbering

One may also extend the notion of the ISO layout to support keys that don't map directly to the diagram above (such as the Android device - see diagram). Per the ISO standard, the space bar is mapped to "A03", so the period and comma keys are mapped to "A02" and "A04" respectively based on their relative position to the space bar. Also note that the "E" row does not exist on the Android keyboard.

keyboard layout example showing extension of ISO key numbering

If it becomes necessary in the future, the format could extend the ISO layout to support keys that are located to the left of the "00" column by using negative column numbers "-01", "-02" and so on, or 100's complement "99", "98",...

Hardware map: A mapping between key codes and ISO layout positions.

Base character: The character emitted by a particular key when no modifiers are active. In ISO terms, this is group 1, level 1.

Base map: A mapping from the ISO positions to the base characters. There is only one base map per layout. The characters on this map can be output by not using any modifier keys.

Key map: The basic mapping between ISO positions and the output characters for each set of modifier combinations associated with a particular layout. There may be multiple key maps for each layout.

Transform: A transform is simply a combination of key presses that gets transformed into one (or more) final characters. For example, on most Latin keyboards hitting the "^" dead-key followed by the "e" key produces "ê".

Layout: A layout is the overall keyboard configuration for a particular locale. Within a keyboard layout, there is a single base map, one or more key maps and one or more transforms.

1.3 File and Directory Structure

Each platform has its own directory, where a "platform" is a designation for a set of keyboards available from a particular source, such as Windows or ChromeOS. This directory name is the platform name (see Table 2 located further in the document). Within this directory there are two types of files:

  1. A single platform file (see XML structure for Platform file). This file includes a mapping of hardware key codes to the ISO layout positions. It is also open to expansion for any configuration elements that are valid across the whole platform and that are not layout specific. This file is simply called _platform.xml.
  2. Multiple layout files named by their locale identifiers (eg. lt-t-k0-chromeos.xml or ne-t-k0-windows.xml).

Keyboard data that is not supported on a given platform, but intended for use with that platform, may be added to the directory /und/. For example, there could be a file /und/lt-t-k0-chromeos.xml, where the data is intended for use with ChromeOS, but does not reflect data that is distributed as part of a standard ChromeOS release.

1.4 Element Hierarchy - Layout File

1.4.1 Element: keyboard

This is the top level element. All other elements defined below are under this element.

Syntax

<keyboard locale="{locale ID}">

{definition of the layout as described by the elements defined below}

</keyboard>

Attribute: locale (required)

This mandatory attribute represents the locale of the keyboard using Unicode locale identifiers (see LDML) - for example 'el' for Greek. Sometimes, the locale may not specify the base language. For example, a Devanagari keyboard for many languages could be specified by BCP-47 code: 'und-Deva'. For details, see Keyboard IDs.

Examples (for illustrative purposes only, not indicative of the real data)

<keyboard locale="ka-t-k0-qwerty-windows">
  …
</keyboard>
<keyboard locale="fr-CH-t-k0-android">
  …
</keyboard>

1.4.2 Element: version

Element used to keep track of the source data version.

Syntax

<version platform=".." number="..">

Attribute: platform (required)

The platform source version. Specifies what version of the platform the data is from. For example, data from Mac OSX 10.4 would be specified as platform="10.4". For platforms that have unstable version numbers which change frequently (like Linux), this field is set to an integer representing the iteration of the data starting with "1". This number would only increase if there were any significant changes in the keyboard data.

Attribute: number (required)

The data revision version.

Attribute: cldrVersion (fixed by DTD)

The CLDR specification version that is associated with this data file. This value is fixed and is inherited from the DTD file and therefore does not show up directly in the XML file.

Example

<keyboard locale="..-osx">

<version platform="10.4" number="1"/>

</keyboard>


1.4.3 Element: generation

Element used to keep track of the generation date of the data.

Syntax

<generation date="..">

Attribute: date (required)

The date the data was generated.

Example

<keyboard locale="..">

<generation date="$Date: 2013-03-09 01:08:49 -0800 (Sat, 09 Mar 2013) $"/>

</keyboard>


1.4.4 Element: names

Element used to store any names given to the layout by the platform.

Syntax

<names>

{set of name elements}

</names>

1.4.5 Element: name

A single name given to the layout by the platform.

Syntax

<name value="..">

Attribute: value (required)

The name of the layout.

Example

<keyboard locale="bg-t-k0-windows-phonetic-trad">

<names>

<name value="Bulgarian (Phonetic Traditional)"/>

</names>

</keyboard>


1.4.6 Element: settings

An element used to keep track of layout specific settings. This element may or may not show up on a layout. These settings reflect the normal practice on the platform. However, an implementation using the data may customize the behavior. For example, for transformFailure the implementation could ignore the setting, or modify the text buffer in some other way (such as by emitting backspaces).

Syntax

<settings [fallback="omit"] [transformFailure="omit"] [transformPartial="hide"]>

Attribute: fallback="omit" (optional)

The presence of this attribute means that when a modifier key combination goes unmatched, no output is produced. The default behavior (when this attribute is not present) is to fall back to the base map when the modifier key combination goes unmatched.

If this attribute is present, it must have a value of omit.

Attribute: transformFailure="omit" (optional)

This attribute describes the behavior of a transform when it is escaped (see the transform element in the Layout file for more information). A transform is escaped when it can no longer continue due to the entry of an invalid key. For example, suppose the following set of transforms is valid:

^e → ê

^a → â

Suppose a user now enters the "^" key. The "^" is stored in a buffer and may or may not be shown to the user (see the transformPartial attribute).

If a user now enters d, then the transform has failed and there are two options for output.

1. default behavior - "^d"

2. omit - "" (nothing and the buffer is cleared)

The default behavior (when this attribute is not present) is to emit the contents of the buffer upon failure of a transform.

If this attribute is present, it must have a value of omit.

Attribute: transformPartial="hide" (optional)

This attribute describes the behavior of the system while in a transform. When this attribute is present, the values of the buffer are not shown as the user is typing a transform (this behavior can be seen on Windows or Linux platforms).

By default (when this attribute is not present), show the values of the buffer as the user is typing a transform (this behavior can be seen on the Mac OSX platform).

If this attribute is present, it must have a value of hide.

Example

<keyboard locale="bg-t-k0-windows-phonetic-trad">

<settings fallback="omit" transformPartial="hide" />

</keyboard>

Indicates that:

  1. When a modifier combination goes unmatched, do not output anything when a key is pressed.
  2. If a transform is escaped, output the contents of the buffer.
  3. During a transform, hide the contents of the buffer as the user is typing.
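A reader of the layout file might resolve these three optional attributes into explicit behavior flags along the following lines. This is a minimal Python sketch: the Settings class, its field names and the read_settings helper are hypothetical, chosen only to mirror the defaults described above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container; the defaults apply when an attribute is absent."""
    fallback_to_base: bool = True           # False when fallback="omit"
    emit_on_transform_failure: bool = True  # False when transformFailure="omit"
    show_partial_transform: bool = True     # False when transformPartial="hide"


def read_settings(layout_xml):
    """Read the optional settings element of a layout file."""
    node = ET.fromstring(layout_xml).find("settings")
    if node is None:
        return Settings()
    return Settings(
        fallback_to_base=node.get("fallback") != "omit",
        emit_on_transform_failure=node.get("transformFailure") != "omit",
        show_partial_transform=node.get("transformPartial") != "hide",
    )


EXAMPLE = """
<keyboard locale="bg-t-k0-windows-phonetic-trad">
  <settings fallback="omit" transformPartial="hide" />
</keyboard>
""".strip()

print(read_settings(EXAMPLE))
```

For the example above this yields no fallback to the base map, emission of the buffer on transform failure (the default), and a hidden buffer during a transform.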

1.4.7 Element: keyMap

This element defines the group of mappings for all the keys that use the same set of modifier keys. It contains one or more map elements.

Syntax

<keyMap [modifiers="{Set of Modifier Combinations}"]>

{a set of map elements}

</keyMap>

Attribute: modifiers (optional)

A set of modifier combinations that cause this key map to be "active". Each combination is separated by a space. The interpretation is that there is a match if any of the combinations match, that is, they are ORed. Therefore, the order of the combinations within this attribute does not matter.

A combination is simply a concatenation of words to represent the simultaneous activation of one or more modifier keys. The order of the modifier keys within a combination does not matter, although "don't care" cases are generally added to the end of the string for readability (see next paragraph). For example: "cmd+caps" represents the Caps Lock and Command modifier key combination. Some keys have right or left variant keys, specified by an 'R' or 'L' suffix. For example: "ctrlR+caps" would represent the Right-Control and Caps Lock combination. For simplicity, the presence of a modifier without an 'R' or 'L' suffix means that either its left or right variants are valid. So "ctrl+caps" represents the same as "ctrlL+ctrlR?+caps ctrlL?+ctrlR+caps".

A modifier key may be further specified to be in a "don't care" state using the '?' suffix. The "don't care" state simply means that the preceding modifier key may be either ON or OFF. For example "ctrl+shift?" could be expanded into "ctrl ctrl+shift".

Within a combination, the presence of a modifier WITHOUT the '?' suffix indicates this key MUST be on. The converse is also true, the absence of a modifier key means it MUST be off for the combination to be active.

Here is an exhaustive list of all possible modifier keys:

Possible Modifier Keys

Modifier Keys | | Comments
altL | altR | xAlty → xAltR+AltL? xAltR?AltLy
ctrlL | ctrlR | ditto for Ctrl
shiftL | shiftR | ditto for Shift
optL | optR | ditto for Opt
caps | | Caps Lock
cmd | | Command on the Mac

All sets of modifier combinations within a layout are disjoint, with no overlap between the key maps. That is, for every possible modifier combination, there is at most a single match within the layout file. There are thus never multiple matches. If no exact match is available, the match falls back to the base map unless the fallback="omit" attribute in the settings element is set, in which case there would be no output at all.

To illustrate, the following example produces an invalid layout because pressing the "Ctrl" modifier key produces an indeterminate result:

<keyMap modifiers="ctrl+shift?">

</keyMap>

<keyMap modifiers="ctrl">

</keyMap>

Modifier Examples:

<keyMap modifiers="cmd?+opt+caps?+shift" />

Caps-Lock may be ON or OFF, Option must be ON, Shift must be ON and Command may be ON or OFF.

<keyMap modifiers="shift caps" fallback="true" />

Caps-Lock must be ON OR Shift must be ON. This is also the fallback key map.

If the modifiers attribute is not present on a keyMap then that particular key map is the base map.
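The matching rules in this section (ORed combinations, the '?' suffix, unsuffixed names covering left and right variants, and absent modifiers being OFF) can be checked mechanically by expanding a modifiers value into the set of concrete key states it matches. The following Python sketch does this by brute force over the ten modifier keys listed above. The function names matcher, expand and overlaps are hypothetical, and the sketch does not handle a combination that mixes a sided and an unsided form of the same modifier.

```python
from itertools import product

SIDED = ("alt", "ctrl", "shift", "opt")  # modifiers with L/R variants
KEYS = tuple(m + side for m in SIDED for side in "LR") + ("caps", "cmd")

# Every possible ON/OFF state of the ten keys, as a frozenset of the ON keys.
ALL_STATES = [
    frozenset(key for key, on in zip(KEYS, bits) if on)
    for bits in product((False, True), repeat=len(KEYS))
]


def matcher(combination):
    """Return a predicate telling whether a key state matches one combination."""
    required, allowed, either_side = set(), set(), []
    for token in combination.split("+"):
        name, dont_care = token.rstrip("?"), token.endswith("?")
        if name in SIDED:
            pair = {name + "L", name + "R"}
            allowed |= pair
            if not dont_care:
                either_side.append(pair)  # at least one side must be ON
        elif name in KEYS:
            allowed.add(name)
            if not dont_care:
                required.add(name)
        else:
            raise ValueError(f"unknown modifier: {token!r}")

    def matches(state):
        # Required keys are ON, nothing outside `allowed` is ON.
        return required <= state <= allowed and all(state & pair for pair in either_side)

    return matches


def expand(modifiers):
    """Set of key states matched by a modifiers value (None means base map)."""
    if modifiers is None:
        return {frozenset()}
    states = set()
    for combination in modifiers.split():
        matches = matcher(combination)
        states.update(state for state in ALL_STATES if matches(state))
    return states


def overlaps(first, second):
    """True if two modifiers values can match the same key state."""
    return bool(expand(first) & expand(second))


print(overlaps("ctrl+shift?", "ctrl"))
```

Under these assumptions, expand("ctrl+shift?") equals expand("ctrl ctrl+shift"), and the invalid layout shown earlier is detected because "ctrl+shift?" and "ctrl" share the states in which only a Control key is ON.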

1.4.8 Element: map

This element defines a mapping between the base character and the output for a particular set of active modifier keys. This element must have the keyMap element as its parent.

If no map element has been defined for a particular ISO layout position, then no output is produced when that key is pressed.

Syntax

<map
 iso="{the iso position}"
 to="{the output}"
 [longPress="{long press keys}"]
 [transform="no"]
/><!-- {Comment to improve readability (if needed)} -->

Attribute: iso (exactly one of base and iso is required)

The iso attribute represents the ISO layout position of the key (see the definition at the beginning of the document for more information).

Attribute: to (required)

The to attribute contains the output sequence of characters that is emitted when pressing this particular key. Control characters, whitespace (other than the regular space character) and combining marks in this attribute are escaped using the \u{...} notation.

Attribute: longPress (optional)

The longPress attribute contains any characters that can be emitted by "long-pressing" a key; this feature is prominent on mobile devices. The possible sequences of characters that can be emitted are whitespace delimited. Control characters, combining marks and whitespace (which is intended to be a long-press option) in this attribute are escaped using the \u{...} notation.

Attribute: transform="no" (optional)

The transform attribute is used to define a key that never participates in a transform but its output shows up as part of a transform. This attribute is necessary because two different keys could output the same characters (with different keys or modifier combinations) but only one of them is intended to be a dead-key and participate in a transform. This attribute value must be no if it is present.

For example, suppose there are the following keys, their output and one transform:

E00 outputs `

Option+E00 outputs ` (the dead-version which participates in transforms).

`e → è

Then the first key must be tagged with transform="no" to indicate that it should never participate in a transform.

Comment: US key equivalent, base key, escaped output and escaped longpress

In the generated files, a comment is included to help the readability of the document. This comment simply shows the English key equivalent (with prefix key=), the base character (base=), the escaped output (to=) and escaped long-press keys (long=). These comments have been inserted strategically in places to improve readability. Not all comments include all components since some of them may be obvious.

Examples

<keyboard locale="fr-BE-t-k0-windows">

<keyMap modifiers="shift">
<map iso="D01" to="A" /> <!-- key=Q -->
<map iso="D02" to="Z" /> <!-- key=W -->
<map iso="D03" to="E" />
<map iso="D04" to="R" />
<map iso="D05" to="T" />
<map iso="D06" to="Y" />

</keyMap>

</keyboard>
<keyboard locale="ps-t-k0-windows">

<keyMap modifiers='altR+caps? ctrl+alt+caps?'>
<map iso="D04" to="\u{200e}" /> <!-- key=R base=ق -->
<map iso="D05" to="\u{200f}" /> <!-- key=T base=ف -->
<map iso="D08" to="\u{670}" /> <!-- key=I base=ه to= ٰ -->

</keyMap>

</keyboard>
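The \u{...} notation used in the to and longPress attributes is not resolved by an XML parser, so a consumer of the data has to decode it after reading the attribute. A minimal Python sketch follows, assuming each escape holds a single hexadecimal code point of one to six digits. The unescape and escape helper names are hypothetical.

```python
import re

# Assumption: one hexadecimal code point (1 to 6 digits) per \u{...} escape.
_ESCAPE = re.compile(r"\\u\{([0-9A-Fa-f]{1,6})\}")


def unescape(value):
    """Replace every \\u{...} escape in an attribute value by its character."""
    return _ESCAPE.sub(lambda match: chr(int(match.group(1), 16)), value)


def escape(char):
    """Write a single character in \\u{...} notation."""
    return "\\u{%x}" % ord(char)


print(repr(unescape(r"\u{200e}")))
```

For the Pashto example above, unescape applied to the to value of position D04 gives the single character U+200E (LEFT-TO-RIGHT MARK).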

1.4.9 Element: transforms

This element defines a group of one or more transform elements associated with this keyboard layout. This is used to support dead-keys using a straightforward structure that works for all the keyboards tested, and that results in readable source data.

There can be multiple <transforms> elements; at this point the "simple" one is defined.

Syntax

<transforms type="...">

{a set of transform elements}

</transforms>

Attribute: type (required)

The value is "simple" for the transforms listed below. People have legitimate needs for more complex transforms, and more sophisticated types of transforms may be added over time. (Doing the more sophisticated transforms would take much more time, since it would require a thorough survey of the major keyboard mechanisms that use them, development of a unified mechanism that handles all the requirements, coding to ensure that programmatically mapping those mechanisms into the standard is possible, and so on.)

1.4.10 Element: transform

This element must have the transforms element as its parent. This element represents a single transform that may be performed using the keyboard layout. A transform is simply a combination of key presses that gets transformed into one (or more) final characters. For example, in most French keyboards hitting the "^" dead-key followed by the "e" key produces "ê".

Syntax

<transform from="{combination of characters}" to="{output}">

Attribute: from (required)

This is the combination of keys that must be pressed in order to activate this transform. Each character in this series of characters must match a character that is located in some to attribute of a map element in the document.

For example, suppose there are the following transforms:

^e → ê

^a → â

^o → ô

If the user types a key that produces "^", the keyboard enters a dead state. When the user then types a key that produces an "e", the transform is invoked, and "ê" is output. Suppose a user presses keys producing "^" then "u". In this case, there is no match for "^u", and the "^" is output unless the transformFailure attribute in the settings element is set to omit. If there is no transform starting with "u", then it is also output (again, unless transformFailure is set to omit) and the mechanism leaves the "dead" state.

The UI may show an initial sequence of matching characters with a special format, as is done with dead-keys on the Mac, and modify them as the transform completes. This behavior is controlled by the transformPartial attribute in the settings element.

Most transforms in practice have only a couple of characters. But for completeness, the behavior is defined on all strings:

  1. If there could be a longer match if the user were to type additional keys, go into a 'dead' state.
  2. If there could not be a longer match, find the longest actual match, emit the transformed text (unless transformFailure is set to omit), and start processing again with the remainder.
  3. If there is no possible match, output the first character, and start processing again with the remainder.

Suppose that there are the following transforms:

ab → x

abc → y

abef → z

bc → m

beq → n

Here's what happens when the user types various sequences of characters:

Input characters | Result | Comments
ab | | No output, since there is a longer transform with this as prefix.
abc | y | Complete transform match.
abd | xd | The longest match is "ab", so that is converted and output. The 'd' follows, since it is not the start of any transform.
abeq | xeq | "ab" wins over "beq", since it comes first. That is, there is no longer possible match starting with 'a'.
bc | m |

Control characters, combining marks and whitespace in this attribute are escaped using the \u{...} notation.
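The three matching rules given earlier for the from attribute, together with the example table, can be sketched in Python as below. This is only one possible reading of those rules: the function name apply_transforms is hypothetical, the sketch always emits on failure (it ignores transformFailure="omit"), and it does not model how partially typed transforms are displayed.

```python
def apply_transforms(typed, transforms):
    """Feed `typed` through simple transforms, one character at a time.

    `transforms` maps each `from` string to its `to` string. Returns a pair
    (output so far, characters still held in the dead state).
    """
    output, pending = [], ""
    for char in typed:
        pending += char
        while pending:
            # Rule 1: a longer match is still possible, so wait for more keys.
            if any(len(src) > len(pending) and src.startswith(pending) for src in transforms):
                break
            matches = [src for src in transforms if pending.startswith(src)]
            if matches:
                # Rule 2: emit the longest actual match, continue with the rest.
                best = max(matches, key=len)
                output.append(transforms[best])
                pending = pending[len(best):]
            else:
                # Rule 3: no possible match, output the first character.
                output.append(pending[0])
                pending = pending[1:]
    return "".join(output), pending


EXAMPLE = {"ab": "x", "abc": "y", "abef": "z", "bc": "m", "beq": "n"}

for keys in ("ab", "abc", "abd", "abeq", "bc"):
    print(keys, apply_transforms(keys, EXAMPLE))
```

With the example transforms, the input "ab" produces no output and leaves "ab" pending, while "abeq" produces "xeq" with nothing pending, in line with the table.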

Attribute: to (required)

This attribute represents the characters that are output from the transform. This may be more than one, so you could have <transform from="´A" to="Fred"/>.

Control characters, whitespace (other than the regular space character) and combining marks in this attribute are escaped using the \u{...} notation.

Examples

<keyboard locale="fr-CA-t-k0-CSA-osx">
<transforms type="simple">
<transform from="´a" to="á" />
<transform from="´A" to="Á" />
<transform from="´e" to="é" />
<transform from="´E" to="É" />
<transform from="´i" to="í" />
<transform from="´I" to="Í" />
<transform from="´o" to="ó" />
<transform from="´O" to="Ó" />
<transform from="´u" to="ú" />
<transform from="´U" to="Ú" />
</transforms>
...
</keyboard>
<keyboard locale="nl-BE-t-k0-chromeos">
<transforms type="simple">
<transform from="\u{30c}a" to="ǎ" /> <!-- ̌a → ǎ -->
<transform from="\u{30c}A" to="Ǎ" /> <!-- ̌A → Ǎ -->
<transform from="\u{30a}a" to="å" /> <!-- ̊a → å -->
<transform from="\u{30a}A" to="Å" /> <!-- ̊A → Å -->
</transforms>
...
</keyboard>

1.5 Element Hierarchy - Platform File

There is a separate XML structure for platform-specific configuration elements. The most notable component is a mapping from the hardware key codes to the ISO layout positions for that platform.

1.5.1 Element: platform

This is the top level element. This element contains a set of elements defined below. A document shall only contain a single instance of this element.

Syntax

<platform>

{platform-specific elements}

</platform>

1.5.2 Element: hardwareMap

This element must have a platform element as its parent. This element contains a set of map elements defined below. A document shall only contain a single instance of this element.

Syntax

<platform>
    <hardwareMap>
        {a set of map elements}
    </hardwareMap>
</platform>

1.5.3 Element: map

This element must have a hardwareMap element as its parent. This element maps between a hardware keycode and the corresponding ISO layout position of the key.

Syntax

<map keycode="{hardware keycode}" iso="{ISO layout position}"/>

Attribute: keycode (required)

The hardware key code value of the key. This value is an integer which is provided by the keyboard driver.

Attribute: iso (required)

The corresponding position of a key using the ISO layout convention where rows are identified by letters and columns are identified by numbers. For example, "D01" corresponds to the "Q" key on a US keyboard. (See the definition at the beginning of the document for a diagram).

Examples

<platform>
<hardwareMap>
<map keycode="2" iso="E01" />
<map keycode="3" iso="E02" />
<map keycode="4" iso="E03" />
<map keycode="5" iso="E04" />
<map keycode="6" iso="E05" />
<map keycode="7" iso="E06" />
<map keycode="41" iso="E00" />
</hardwareMap>
</platform>

1.6 Invariants

Beyond what the DTD imposes, certain other restrictions are imposed on the data.

  1. For a given platform, every map[@iso] value must be in the hardwareMap if there is one (_keycodes.xml)
  2. Every map[@base] value must also be in base[@base] value
  3. No keyMap[@modifiers] value can overlap with another keyMap[@modifiers] value.
    • eg you can't have "RAlt Ctrl" in one keyMap, and "Alt Shift" in another (because Alt = RAltLAlt).
  4. Every sequence of characters in a transform[@from] value must be a concatenation of two or more map[@to] values.
    • eg with <transform from="xyz" to="q"> there must be some map values to get there, such as <map... to="xy"> & <map... to="z">
  5. There must be either 0 or 1 of (keyMap[@fallback] or baseMap[@fallback]) attributes
  6. If the base and chars values for modifiers="" are all identical, and there are no longpresses, that keyMap must not appear (??)
  7. There will never be overlaps among modifier values.
  8. A modifier set will never have ? (optional) on all values
    • eg, you'll never have RCtrl?Caps?LShift?
  9. Every base[@base] value must be unique.
  10. A modifier attribute value will always be minimal, observing the following simplification rules.

Notation | Notes
Lower case character (eg. x) | Interpreted as any combination of modifiers (eg. x = CtrlShiftOption).
Upper-case character (eg. Y) | Interpreted as a single modifier key, which may or may not have an L and R variant (eg. Y = Ctrl, RY = RCtrl, etc.).
Y? ⇔ Y ∨ ∅ | Eg. Opt? ⇔ ROpt ∨ LOpt ∨ ROptLOpt ∨ ∅
Y ⇔ LY ∨ RY ∨ LYRY | Eg. Opt ⇔ ROpt ∨ LOpt ∨ ROptLOpt

Axiom | Example
xY ∨ x ⇒ xY? | OptCtrlShift OptCtrl → OptCtrlShift?
xRY ∨ xY? ⇒ xY?; xLY ∨ xY? ⇒ xY? | OptCtrlRShift OptCtrlShift? → OptCtrlShift?
xRY? ∨ xY ⇒ xY?; xLY? ∨ xY ⇒ xY? | OptCtrlRShift? OptCtrlShift → OptCtrlShift?
xRY? ∨ xY? ⇒ xY?; xLY? ∨ xY? ⇒ xY? | OptCtrlRShift? OptCtrlShift? → OptCtrlShift?
xRY ∨ xY ⇒ xY; xLY ∨ xY ⇒ xY | OptCtrlRShift OptCtrlShift → OptCtrlShift
LY?RY? ⇒ Y? | OptRCtrl?LCtrl? → OptCtrl?
xLY? ∨ xLY ⇒ xLY? |
xY? ∨ xY ⇒ xY? |
xY? ∨ x ⇒ xY? |
xLY? ∨ x ⇒ xLY? |
xLY ∨ x ⇒ xLY? |
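Invariant 4 in the list above (every transform[@from] value is a concatenation of two or more map[@to] values) lends itself to an automated check. The Python sketch below is one way to test it with a small dynamic program. The function name is hypothetical and the check is not taken from any CLDR tooling.

```python
def is_concatenation_of_outputs(source, outputs, min_parts=2):
    """True if `source` splits into at least `min_parts` strings from `outputs`.

    `source` stands for a transform[@from] value and `outputs` for the set of
    all map[@to] values of the layout.
    """
    # most[i] is the largest number of pieces spelling source[:i], or -1 if
    # source[:i] cannot be spelled at all.
    most = [0] + [-1] * len(source)
    for end in range(1, len(source) + 1):
        for start in range(end):
            if most[start] >= 0 and source[start:end] in outputs:
                most[end] = max(most[end], most[start] + 1)
    return most[-1] >= min_parts


print(is_concatenation_of_outputs("xyz", {"xy", "z"}))
```

This mirrors the example given with the invariant: from="xyz" is acceptable when some keys output "xy" and "z", but not when the only way to produce it is a single key that outputs "xyz".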

 

1.7 Data Sources

Here is a list of the data sources used to generate the initial key map layouts:

Platform | Source | Notes
Android | Android 4.0 - Ice Cream Sandwich (http://source.android.com/source/downloading.html) | Parsed layout files located in packages/inputmethods/LatinIME/java/res
ChromeOS | XKB (http://www.x.org/wiki/XKB) | ChromeOS represents a very small subset of the keyboards available from XKB.
Mac OSX | Ukelele bundled System Keyboards (http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=ukelele) | These layouts date from Mac OSX 10.4 and are therefore a bit outdated.
Windows | Generated .klc files from the Microsoft Keyboard Layout Creator (http://msdn.microsoft.com/en-us/goglobal/bb964665) | For interactive layouts, see also http://msdn.microsoft.com/en-us/goglobal/bb964651

1.8 Keyboard IDs

There is a set of subtags that help identify the keyboards. Each of these is used after the "t-k0" subtags to help identify the keyboards. The first tag appended is a mandatory platform tag followed by zero or more tags that help differentiate the keyboard from others with the same locale code.

1.8.1 Principles for Keyboard Ids

The following are the design principles for the ids.

  1. BCP47 compliant.
    1. Eg, "en-t-k0-extended".
  2. Use the minimal language id based on likelySubtags.
    1. Eg, instead of en-US-t-k0-xxx, use en-t-k0-xxx. Because there is <likelySubtag from="en" to="en_Latn_US"/>, en-US → en.
    2. The data is in http://unicode.org/repos/cldr/trunk/common/supplemental/likelySubtags.xml
  3. The platform goes first, if it exists. If a keyboard on the platform changes over time, both are dated, eg bg-t-k0-chromeos-2011. When selecting, if there is no date, it means the latest one.
  4. Keyboards are only tagged that differ from the "standard for each platform". That is, for each language on a platform, there will be a keyboard with no subtags other than the platform. Subtags with a common semantics across platforms are used, such as -extended, -phonetic, -qwerty, -qwertz, -azerty, …
  5. In order to get to 8 letters, abbreviations are reused that are already in bcp47 -u/-t extensions and in language-subtag-registry variants, eg for Traditional use "-trad" or "-traditio" (both exist in bcp47).
  6. Multiple languages cannot be indicated, so the predominant target is used.
    1. For Finnish + Sami, use fi-t-k0-smi or extended-smi
  7. In some cases, there are multiple subtags, like en-US-t-k0-chromeos-intl-altgr.xml
  8. Otherwise, platform names are used as a guide.
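To show how such an identifier decomposes, the following Python sketch splits a keyboard ID at the "t-k0" subtags. It assumes, per principle 3, that the platform subtag comes first when present, and it recognizes a platform only from a small hypothetical list based on the platform names used in this document. Identifiers that place the platform elsewhere are returned with all subtags left in the variant list.

```python
# Hypothetical list, taken from the platform names that appear in this document.
KNOWN_PLATFORMS = {"android", "chromeos", "osx", "windows"}


def parse_keyboard_id(locale_id):
    """Split a keyboard ID into (language part, platform or None, other subtags)."""
    language, separator, rest = locale_id.partition("-t-k0-")
    if not separator or not rest:
        raise ValueError(f"not a keyboard ID: {locale_id!r}")
    subtags = rest.split("-")
    if subtags[0] in KNOWN_PLATFORMS:
        return language, subtags[0], subtags[1:]
    return language, None, subtags


print(parse_keyboard_id("bg-t-k0-chromeos-2011"))
```

For example, "bg-t-k0-chromeos-2011" splits into the language "bg", the platform "chromeos" and the remaining subtag "2011".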

1.9 Platform Behaviors in Edge Cases

Platform | No modifier combination match is available | No map match is available for key position | Transform fails (ie. if ^d is pressed when that transform does not exist)
ChromeOS | Fall back to base | Fall back to character in a keyMap with same "level" of modifier combination. If this character does not exist, fall back to (n-1) level. (This is handled data-generation side). In the spec: No output | No output at all
Mac OSX | Fall back to base (unless combination is some sort of keyboard shortcut, eg. cmd-c) | No output | Both keys are output separately
Windows | No output | No output | Both keys are output separately