L2/01-324
Variation Selectors
M. Davis, 2000-08-15
There are 3 variation selectors in Unicode 3.1.1 (180B..180D). A further
256 have been accepted by the UTC and are being submitted to WG2:
FE00..FE0F and E0110..E01FF (FE00 has already been accepted by WG2).
Here are the properties of the variation selectors:
- General category = Mn (nonspacing mark)
- This assignment percolates into other tied properties, such as BIDI
class, LineBreak, etc.
- Combining Class = 0
- Joining Class = Transparent
- Representative Glyph = zero-width, invisible glyph.
- As with other zero-width invisible glyphs, implementations may allow the
option of displaying the VS characters visibly, such as in a "Show
Hidden" option.
- Special Behavior
- When a specific VS occurs immediately after a specific base character,
as specified in StandardXXX.html in the Unicode Character Database, the
base character should be displayed with the variant glyph specified in
that file if possible. If that is not possible, the VS shall have no
effect on the selection of the glyph for that base character.
- If a VS occurs after any other character, it shall have no effect on
the selection of the glyph for that character.
- Policy Invariant: StandardXXX.html will not contain associations
between non-base characters and variation selectors.
- Default Collation behavior: completely ignorable: [.0000.0000.0000.0000]
- Note: a contracting sequence of <base, VS> can be tailored for
specific languages, although this is discouraged.
- Transcoding Implications
- Where legacy standards incorporate glyph variants, the conversion into
Unicode may generate two Unicode code points from one legacy code point,
and the conversion from Unicode may generate one legacy code point from
two Unicode code points.
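
The special behavior, default collation, and transcoding rules above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the VARIANTS table is a hypothetical stand-in for StandardXXX.html, the legacy code values 0xA1 and 0xA2 are invented, glyph names are arbitrary strings, and stripping variation selectors is only a rough approximation of "completely ignorable" in a real collation implementation. The code point ranges are copied from the text above and should be checked against the current Unicode Character Database.

```python
# Variation selector ranges exactly as listed in this document (assumption:
# verify against the current Unicode Character Database before reuse).
VS_RANGES = [(0x180B, 0x180D), (0xFE00, 0xFE0F), (0xE0110, 0xE01FF)]

# Hypothetical stand-in for StandardXXX.html: (base, VS) -> variant glyph name.
VARIANTS = {("\u2229", "\uFE00"): "intersection.serif"}

# Hypothetical legacy character set: one legacy code per glyph variant.
LEGACY_TO_UNICODE = {0xA1: "\u2229", 0xA2: "\u2229\uFE00"}
UNICODE_TO_LEGACY = {v: k for k, v in LEGACY_TO_UNICODE.items()}


def is_variation_selector(ch):
    """True if ch falls in one of the VS ranges listed above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in VS_RANGES)


def default_glyph(ch):
    """Placeholder name for the ordinary glyph of a character."""
    return f"U+{ord(ch):04X}"


def select_glyphs(text, available_glyphs):
    """Return one glyph name per visible character.

    A VS changes the glyph only when <base, VS> is listed in VARIANTS and
    the variant glyph is available; otherwise it has no effect. The VS
    itself is rendered as a zero-width invisible glyph, so it is skipped.
    """
    glyphs = []
    for i, ch in enumerate(text):
        if is_variation_selector(ch):
            continue  # invisible; any effect was applied to the base
        glyph = default_glyph(ch)
        if i + 1 < len(text):
            variant = VARIANTS.get((ch, text[i + 1]))
            if variant is not None and variant in available_glyphs:
                glyph = variant
        glyphs.append(glyph)
    return glyphs


def collation_skeleton(text):
    """Approximate default collation by dropping the ignorable VS."""
    return "".join(ch for ch in text if not is_variation_selector(ch))


def from_legacy(codes):
    """One legacy code may expand to two Unicode code points."""
    return "".join(LEGACY_TO_UNICODE[c] for c in codes)


def to_legacy(text):
    """Two Unicode code points may collapse to one legacy code.

    Raises KeyError for characters outside the hypothetical legacy set.
    """
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in UNICODE_TO_LEGACY:
            out.append(UNICODE_TO_LEGACY[pair])
            i += 2
        else:
            out.append(UNICODE_TO_LEGACY[text[i]])
            i += 1
    return out
```

For instance, select_glyphs("\u2229\uFE00", set()) falls back to the default glyph because no variant glyph is available, and to_legacy(from_legacy([0xA2])) round-trips through a two-code-point sequence back to the single legacy code.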