L2/06-265

Suggested vertical property values and suggested assignments to Unicode characters

Kent Karlsson
2006-07-31

Some of the scripts covered by Unicode are commonly written in vertical lines. CJK and Hangul are commonly written both horizontally and vertically (with the next line to the left), and Mongolian and (the historic) Phags-Pa are almost always written in vertical lines (with the next line to the right). In modern contexts, especially when using Unicode, different scripts may be mixed, but the line orientation cannot (reasonably) change in the middle of a text. Hence, there need to be rules on how to handle glyphs (rotate along, counter-rotate, etc.) for different line orientations.

Glyphs do not (or should not) always rotate along with the line rotation. E.g., CJK glyphs are counter-rotated (from horizontal) to remain upright also in vertical CJK line orientation. However, there is very little information on exactly which characters are counter-rotated or go through other adjustments for vertical line layout.

Unicode has as yet no datafile giving properties telling how each character should be handled for vertical line layout. This is a suggestion to add such a datafile (Vertical.txt) for a new property 'Vertical', with explanations for the 'Vertical' property values. The suggested property values, their interpretation, and their suggested assignments for Unicode characters are given below.

---------------------------------------------------
# proposed datafile: Vertical.txt
#
# Vertical lines: Default glyph rotation properties.
#
# The property value assignments may be overridden by higher-level
# protocols.
#
# Vertical CJK lines are considered to be rotated -90 degrees, and most
# non-CJK glyphs are rotated along with that, while most CJK glyphs are
# counter-rotated so they stay "upright".
# Similarly, Mongolian (and Phags-pa) vertical lines are considered
# rotated -90 degrees if bidi is not done (in particular not handling
# Mongolian as bidi R), but such lines are considered rotated +90 degrees
# if bidi is done (in particular handling Mongolian as bidi R). (The
# Mongolian glyphs still come out the same, due to a 180 degree extra
# rotation if bidi is not done.)
#
# This property deals with handling of glyphs for characters for
# three line layouts (or maybe even four):
#
# 1. Horizontal lines, next line below, 0 degrees line rotation.
#    Glyph layout from left to right (after bidi).
#
# 2. Vertical lines, next line to the left, -90 degrees line rotation.
#    Vertical CJK and vertical Hangul. Glyph layout from top to bottom.
#
# 3. Vertical lines, next line to the right, +90 degrees line rotation
#    if bidi is done, -90 degrees line rotation if bidi is not done.
#    Mongolian and Phags-Pa (but also vertical variant of horizontal
#    scripts, if no bidi done and rt (or rl) interpreted as cr). Glyph
#    layout top to bottom for -90 degrees, and from bottom to top for
#    +90 degrees (after bidi).
#
# And maybe also (4.) vertical lines where the letters are still upright;
# case 3 with no bidi plus interpreting rt or rl as cr; see property value
# descriptions below. An Indic subscript or adscript consonant is to be
# treated as a combining character (ih). An Indic consonant conjunct ligature
# or a Hangul syllable (also when constructed from Jamos) is to be treated
# as a base character (rt for Indic, cr for Hangul). If rt, rl, or mo is
# interpreted as cr (for -90 degrees line rotation, no bidi, upright glyphs),
# that may break any cursive joining, and Mn are effectively Mc and Mc are
# effectively Mn.
#
# Other, more unusual, line layouts are not covered.
#
# Some fonts may include the effects of these properties directly. E.g.,
# some CJK fonts incorporate glyph counter-rotations (cr, av) and other
# placement changes (cm, cl, cu) for -90 degree line rotation.
# A layout engine needs to take into account both the glyph rotation
# property for line rotations and the font's already incorporated glyph
# rotation when rotating glyphs to be laid out on a line. If the font has
# precounterrotated (+90 degrees) glyphs (for cr/cm/cu/cl), first nominally
# rotate the prerotated glyphs -90 degrees. cm/cu/cl will (nominally) need
# additional adjustments. If the line rotation is -90 degrees the net
# glyph rotation is then 0 degrees and the additional adjustments should
# be cancelled out. Then just taking the vertical line targeted glyphs
# as they are is both more efficient and may lead to a better result than
# actually going through the motions. If the font does not have
# precounterrotated glyphs and the line layout is vertical, or it only has
# precounterrotated glyphs and the line layout is horizontal, then glyph
# rotation actually needs to be computed (and, among other things, done so
# that one avoids overlapping glyphs in the display of the text).
#
# The Vertical property values and their interpretations are as follows:
#
# rt - Rotate with the line rotation. Except for 'rl', 'ih', and 'pu',
#      'rt' is the vertical property for all characters not listed below.
#
# ih - Inherited. All combining characters (Mn, Me, Mc) have this value
#      for the Vertical property. Their vertical behaviour is inherited
#      from the base character they are applied to, but the glyph stays
#      positioned with the base character (not on its own). These are
#      NOT listed below. Note that Indic vowels that are Mc (spacing
#      combining) are effectively non-spacing in vertical layout IF the
#      base is cr/cm/cu/cl (whereas Mn, especially if stacked, are
#      effectively Mc in vertical layout IF the base is cr/cm/cu/cl).
#
# rl - Bidi strong RtL character. All bidi R and AL characters (except the
#      Mongolian and Phags-Pa characters, which should be bidi R) have this
#      value for the Vertical property (but those are not listed below).
#      If bidi is done, rotate glyphs with line rotation. If bidi is not
#      done, rotate glyphs 180 degrees in addition to the line rotation.
#      Thus these letters then come out upside-down in horizontal lines;
#      but only if bidi is not done. But in Mongolian line layout, they
#      will have the same final rotation whether bidi is done or not.
#
# mo - Mongolian or Phags-pa. While UnicodeData.txt lists these as having
#      bidi property L, they should really be handled as if they had bidi
#      property R. Thus we will here treat them as having bidi property R.
#      (The Unicode chart glyphs are (1) isolated forms and (2) positioned
#      on a vertical line. Rotations given here are relative to horizontal
#      lines.) After the bidi property correction, treat like rl.
#
# nv - No vertical. Should not be used in vertical lines. (Treat like 'rt'.)
#
# cr - Counter-rotate on -90 deg. Handling for CJK ideographs and similar
#      characters. If (and only if) the line rotation is -90 degrees,
#      counter-rotate the glyph so that the glyph is still "upright".
#      Also used for conjoining Hangul jamo, but the entire syllable
#      is rotated as a unit, not the individual jamo. The (vertical)
#      advance "width" is close to the height of the glyph. On the other
#      hand, the (vertical) advance width for rt and rl characters is
#      (close to) the (horizontal) width of the unrotated glyph.
#
# cm - Counter-rotate, and move to (resulting) right side, vertically
#      centered. Used for Katakana and Hiragana small letters and some
#      punctuation. INTERPRET AS 'rt' IF THE PRECEDING CHARACTER IS NOT
#      A 'cr' OR 'av' NOR AN ACTIVATED 'cm' OR 'cu'.
#
# cu - Like 'cm', but move to upper right.
#
# cl - Like 'cm', but move to lower left, and uses a different test.
#      INTERPRET AS 'rt' IF THE FOLLOWING CHARACTER IS NOT A 'cr' OR
#      'av' NOR AN ACTIVATED 'cl'.
#
# av - Already vertical. Used for characters that should only be used
#      for vertical lines, including compatibility characters for
#      parentheses, etc.
#      Treat as 'cr' characters are treated, i.e. counter-rotate,
#      but these should not be used for horizontal lines (just because
#      they may look strange).
#
# pu - Private Use. To be overridden by any private use assignments. (By
#      default these may be treated as 'cr', since most often the PUA is
#      used for CJK characters.) These are not listed below.

0021; cu # EXCLAMATION MARK
0022; cm # QUOTATION MARK
0023..0025; cu # NUMBER SIGN..PERCENT SIGN
0026..0027; cm # AMPERSAND..APOSTROPHE
002C; cu # COMMA
002E; cu # FULL STOP
002F; cm # SOLIDUS
0030..0039; cm # DIGIT ZERO..DIGIT NINE
003A..003F; cu # COLON..QUESTION MARK
0040; cm # COMMERCIAL AT
005C; cm # REVERSE SOLIDUS
00A1; cl # INVERTED EXCLAMATION MARK
00A2..00A9; cu # CENT SIGN..COPYRIGHT SIGN
00AB; cl # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
00AE; cu # REGISTERED SIGN
00B0; cu # DEGREE SIGN
00BB..00BE; cu # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK..VULGAR FRACTION THREE QUARTERS
00BF; cl # INVERTED QUESTION MARK
037E; cu # GREEK QUESTION MARK
1100..115F; cr # HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG FILLER
1160..11A2; cr # HANGUL JUNGSEONG FILLER..HANGUL JUNGSEONG SSANGARAEA
11A8..11F9; cr # HANGUL JONGSEONG KIYEOK..HANGUL JONGSEONG YEORINHIEUH
1800..180A; mo # MONGOLIAN BIRGA..MONGOLIAN NIRUGU
180E; mo # MONGOLIAN VOWEL SEPARATOR
1810..1877; mo # MONGOLIAN DIGIT ZERO..MONGOLIAN LETTER MANCHU ZHA
1880..18A8; mo # MONGOLIAN LETTER ALI GALI ANUSVARA ONE..MONGOLIAN LETTER MANCHU ALI GALI BHA
2018; cl # LEFT SINGLE QUOTATION MARK
2019; cu # RIGHT SINGLE QUOTATION MARK
201A; nv # SINGLE LOW-9 QUOTATION MARK
201B..201C; cl # SINGLE HIGH-REVERSED-9 QUOTATION MARK..LEFT DOUBLE QUOTATION MARK
201D; cu # RIGHT DOUBLE QUOTATION MARK
201E; nv # DOUBLE LOW-9 QUOTATION MARK
201F; cl # DOUBLE HIGH-REVERSED-9 QUOTATION MARK
2030..2031; cu # PER MILLE SIGN..PER TEN THOUSAND SIGN
2039; cl # SINGLE LEFT-POINTING ANGLE QUOTATION MARK
203A; cu # SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
203C..203D; cu # DOUBLE EXCLAMATION MARK..INTERROBANG
2047..2049; cu # DOUBLE QUESTION MARK..EXCLAMATION QUESTION MARK
204F; cl # REVERSED SEMICOLON
205D..205E; cu # TRICOLON..VERTICAL FOUR DOTS
20A0..20B5; cu # EURO-CURRENCY SIGN..CEDI SIGN
2103; cr # DEGREE CELSIUS
2109; cr # DEGREE FAHRENHEIT
2153..215F; cu # VULGAR FRACTION ONE THIRD..FRACTION NUMERATOR ONE
2160..2182; cr # ROMAN NUMERAL ONE..ROMAN NUMERAL TEN THOUSAND
268A..268F; cr # MONOGRAM FOR YANG..DIGRAM FOR GREATER YIN
2E80..2EF3; cr # CJK RADICAL REPEAT..CJK RADICAL C-SIMPLIFIED TURTLE
2F00..2FD5; cr # KANGXI RADICAL ONE..KANGXI RADICAL FLUTE
2FF0..2FFB; cr # IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT..IDEOGRAPHIC DESCRIPTION CHARACTER OVERLAID
3001..3002; cu # IDEOGRAPHIC COMMA..IDEOGRAPHIC FULL STOP
3003..3006; cr # DITTO MARK..IDEOGRAPHIC CLOSING MARK
301D; cl # REVERSED DOUBLE PRIME QUOTATION MARK
301E; cu # DOUBLE PRIME QUOTATION MARK
301F; nv # LOW DOUBLE PRIME QUOTATION MARK
3020..3029; cr # POSTAL MARK FACE..HANGZHOU NUMERAL NINE
302A..302F; cr # IDEOGRAPHIC LEVEL TONE MARK..HANGUL DOUBLE DOT TONE MARK
3031..3035; av # VERTICAL KANA REPEAT MARK..VERTICAL KANA REPEAT MARK LOWER HALF
3036..303A; cr # CIRCLED POSTAL MARK..HANGZHOU NUMERAL THIRTY
303B; av # VERTICAL IDEOGRAPHIC ITERATION MARK
303C..303E; cr # MASU MARK..IDEOGRAPHIC VARIATION INDICATOR
3041; cm # HIRAGANA LETTER SMALL A
3042; cr # HIRAGANA LETTER A
3043; cm # HIRAGANA LETTER SMALL I
3044; cr # HIRAGANA LETTER I
3045; cm # HIRAGANA LETTER SMALL U
3046; cr # HIRAGANA LETTER U
3047; cm # HIRAGANA LETTER SMALL E
3048; cr # HIRAGANA LETTER E
3049; cm # HIRAGANA LETTER SMALL O
304A..3062; cr # HIRAGANA LETTER O..HIRAGANA LETTER DI
3063; cm # HIRAGANA LETTER SMALL TU
3064..3082; cr # HIRAGANA LETTER TU..HIRAGANA LETTER MO
3083; cm # HIRAGANA LETTER SMALL YA
3084; cr # HIRAGANA LETTER YA
3085; cm # HIRAGANA LETTER SMALL YU
3086; cr # HIRAGANA LETTER YU
3087; cm # HIRAGANA LETTER SMALL YO
3088..308D; cr # HIRAGANA LETTER YO..HIRAGANA LETTER RO
308E; cm # HIRAGANA LETTER SMALL WA
308F..3094; cr # HIRAGANA LETTER WA..HIRAGANA LETTER VU
3095..3096; cm # HIRAGANA LETTER SMALL KA..HIRAGANA LETTER SMALL KE
3099..309E; cr # COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK..HIRAGANA VOICED ITERATION MARK
309F; av # HIRAGANA DIGRAPH YORI
30A1; cm # KATAKANA LETTER SMALL A
30A2; cr # KATAKANA LETTER A
30A3; cm # KATAKANA LETTER SMALL I
30A4; cr # KATAKANA LETTER I
30A5; cm # KATAKANA LETTER SMALL U
30A6; cr # KATAKANA LETTER U
30A7; cm # KATAKANA LETTER SMALL E
30A8; cr # KATAKANA LETTER E
30A9; cm # KATAKANA LETTER SMALL O
30AA..30C2; cr # KATAKANA LETTER O..KATAKANA LETTER DI
30C3; cm # KATAKANA LETTER SMALL TU
30C4..30E2; cr # KATAKANA LETTER TU..KATAKANA LETTER MO
30E3; cm # KATAKANA LETTER SMALL YA
30E4; cr # KATAKANA LETTER YA
30E5; cm # KATAKANA LETTER SMALL YU
30E6; cr # KATAKANA LETTER YU
30E7; cm # KATAKANA LETTER SMALL YO
30E8..30ED; cr # KATAKANA LETTER YO..KATAKANA LETTER RO
30EE; cm # KATAKANA LETTER SMALL WA
30EF..30F4; cr # KATAKANA LETTER WA..KATAKANA LETTER VU
30F5..30F6; cm # KATAKANA LETTER SMALL KA..KATAKANA LETTER SMALL KE
30F7..30FA; cr # KATAKANA LETTER VA..KATAKANA LETTER VO
30FB..30FC; rt # KATAKANA MIDDLE DOT..KATAKANA-HIRAGANA PROLONGED SOUND MARK
30FD..30FE; cr # KATAKANA ITERATION MARK..KATAKANA VOICED ITERATION MARK
30FF; av # KATAKANA DIGRAPH KOTO
3105..312C; cr # BOPOMOFO LETTER B..BOPOMOFO LETTER GN
3131..318E; cr # HANGUL LETTER KIYEOK..HANGUL LETTER ARAEAE
3190..319F; cr # IDEOGRAPHIC ANNOTATION LINKING MARK..IDEOGRAPHIC ANNOTATION MAN MARK
31A0..31B7; cr # BOPOMOFO LETTER BU..BOPOMOFO FINAL LETTER H
31C0..31CF; cr # CJK STROKE T..CJK STROKE N
31F0..31FF; cm # KATAKANA LETTER SMALL KU..KATAKANA LETTER SMALL RO
3200..321E; cr # PARENTHESIZED HANGUL KIYEOK..PARENTHESIZED KOREAN CHARACTER O HU
3220..3243; cr # PARENTHESIZED IDEOGRAPH ONE..PARENTHESIZED IDEOGRAPH REACH
3250..32FE; cr # PARTNERSHIP SIGN..CIRCLED KATAKANA WO
3300..33FF; cr # SQUARE APAATO..SQUARE GAL
3400..4DB5; cr # CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DB5
4DC0..4DFF; cr # HEXAGRAM FOR THE CREATIVE HEAVEN..HEXAGRAM FOR BEFORE COMPLETION
4E00..9FBB; cr # CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FBB
A000..A48C; cr # YI SYLLABLE IT..YI SYLLABLE YYR
A490..A4C6; cr # YI RADICAL QOT..YI RADICAL KE
A840..A877; mo # PHAGS-PA LETTER KA..PHAGS-PA MARK DOUBLE SHAD
AC00..D7A3; cr # HANGUL SYLLABLE GA..HANGUL SYLLABLE HIH
F900..FAD9; cr # CJK COMPATIBILITY IDEOGRAPH-F900..CJK COMPATIBILITY IDEOGRAPH-FAD9
FE10..FE19; av # PRESENTATION FORM FOR VERTICAL COMMA..PRESENTATION FORM FOR VERTICAL HORIZONTAL ELLIPSIS
FE30..FE44; av # PRESENTATION FORM FOR VERTICAL TWO DOT LEADER..PRESENTATION FORM FOR VERTICAL RIGHT WHITE CORNER BRACKET
FE45..FE46; cr # SESAME DOT..WHITE SESAME DOT
FE47..FE48; av # PRESENTATION FORM FOR VERTICAL LEFT SQUARE BRACKET..PRESENTATION FORM FOR VERTICAL RIGHT SQUARE BRACKET
FE49..FE4F; cr # DASHED OVERLINE..WAVY LOW LINE
FE50..FE57; cu # SMALL COMMA..SMALL EXCLAMATION MARK
FE58; cm # SMALL EM DASH
FE5F..FE61; cm # SMALL NUMBER SIGN..SMALL ASTERISK
FE68; cm # SMALL REVERSE SOLIDUS
FE6A; cu # SMALL PERCENT SIGN
FE6B; cm # SMALL COMMERCIAL AT
FF01; cu # FULLWIDTH EXCLAMATION MARK
FF02..FF03; cm # FULLWIDTH QUOTATION MARK..FULLWIDTH NUMBER SIGN
FF04..FF05; cu # FULLWIDTH DOLLAR SIGN..FULLWIDTH PERCENT SIGN
FF06..FF07; cm # FULLWIDTH AMPERSAND..FULLWIDTH APOSTROPHE
FF0C; cu # FULLWIDTH COMMA
FF0E; cu # FULLWIDTH FULL STOP
FF0F; cm # FULLWIDTH SOLIDUS
FF10..FF19; cr # FULLWIDTH DIGIT ZERO..FULLWIDTH DIGIT NINE
FF1A; cu # FULLWIDTH COLON
FF1B; cu # FULLWIDTH SEMICOLON
FF1F; cu # FULLWIDTH QUESTION MARK
FF20; cm # FULLWIDTH COMMERCIAL AT
FF21..FF3A; cr # FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LATIN CAPITAL LETTER Z
FF3C; cm # FULLWIDTH REVERSE SOLIDUS
FF41..FF5A; cr # FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN SMALL LETTER Z
FF61; cu # HALFWIDTH IDEOGRAPHIC FULL STOP
FF64; cu # HALFWIDTH IDEOGRAPHIC COMMA
FF66; cr # HALFWIDTH KATAKANA LETTER WO
FF67..FF6F; cr # HALFWIDTH KATAKANA LETTER SMALL A..HALFWIDTH KATAKANA LETTER SMALL TU
FF71..FF9D; cr # HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAKANA LETTER N
FF9E..FF9F; cr # HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK
FFA1..FFBE; cr # HALFWIDTH HANGUL LETTER KIYEOK..HALFWIDTH HANGUL LETTER HIEUH
FFC2..FFC7; cr # HALFWIDTH HANGUL LETTER A..HALFWIDTH HANGUL LETTER E
FFCA..FFCF; cr # HALFWIDTH HANGUL LETTER YEO..HALFWIDTH HANGUL LETTER OE
FFD2..FFD7; cr # HALFWIDTH HANGUL LETTER YO..HALFWIDTH HANGUL LETTER YU
FFDA..FFDC; cr # HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
FFE0..FFE1; cu # FULLWIDTH CENT SIGN..FULLWIDTH POUND SIGN
FFE5..FFE6; cu # FULLWIDTH YEN SIGN..FULLWIDTH WON SIGN
1D300..1D356; cr # MONOGRAM FOR EARTH..TETRAGRAM FOR FOSTERING
20000..2A6D6; cr # CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6D6
2F800..2FA1D; cr # CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
####################################################
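
As an illustration only (not part of the proposal), the following Python sketch shows one way a layout engine might consume a file in the format above. It parses a few entries copied from the datafile, applies the unlisted defaults (ih, pu, rl, rt), applies the contextual tests for cm, cu and cl, and computes a net glyph rotation. All function names (parse_vertical, base_value, resolve, glyph_rotation) are hypothetical. The sketch assumes that listed values take priority over the defaults, ignores combining characters when looking at neighbours, and does not model the positional adjustments of cm/cu/cl or prerotated font glyphs.

```python
import unicodedata

# A few entries copied from the proposed Vertical.txt; a real engine
# would read the whole file.
SAMPLE = """\
0021; cu # EXCLAMATION MARK
0030..0039; cm # DIGIT ZERO..DIGIT NINE
00AB; cl # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
1810..1877; mo # MONGOLIAN DIGIT ZERO..MONGOLIAN LETTER MANCHU ZHA
3041; cm # HIRAGANA LETTER SMALL A
4E00..9FBB; cr # CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FBB
"""

COUNTER_ROTATED = {"cr", "av", "cm", "cu", "cl"}


def parse_vertical(text):
    """Parse 'XXXX[..YYYY]; value # comment' lines into (first, last, value)."""
    ranges = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        code_points, value = (field.strip() for field in data.split(";"))
        first, _, last = code_points.partition("..")
        ranges.append((int(first, 16), int(last or first, 16), value))
    return ranges


def base_value(ranges, ch):
    """Context-free value: listed value first (assumption), then defaults."""
    cp = ord(ch)
    for first, last, value in ranges:
        if first <= cp <= last:
            return value
    category = unicodedata.category(ch)
    if category in ("Mn", "Me", "Mc"):
        return "ih"
    if category == "Co":
        return "pu"
    if unicodedata.bidirectional(ch) in ("R", "AL"):
        return "rl"
    return "rt"


def resolve(values):
    """Apply the contextual tests: unactivated cm/cu/cl fall back to 'rt'."""
    out = list(values)
    # cm/cu look at the preceding (already resolved) character.
    for i, v in enumerate(out):
        if v in ("cm", "cu") and (
            i == 0 or out[i - 1] not in ("cr", "av", "cm", "cu")
        ):
            out[i] = "rt"
    # cl looks at the following character, so walk backwards.
    for i in range(len(out) - 1, -1, -1):
        if out[i] == "cl" and (
            i == len(out) - 1 or out[i + 1] not in ("cr", "av", "cl")
        ):
            out[i] = "rt"
    return out


def glyph_rotation(value, line_rotation, bidi_done=True):
    """Net glyph rotation in degrees for a resolved value, in (-180, 180]."""
    if value in COUNTER_ROTATED and line_rotation == -90:
        angle = 0  # counter-rotated: the glyph stays upright
    elif value in ("rl", "mo") and not bidi_done:
        angle = line_rotation + 180  # extra half turn when bidi is not done
    else:
        angle = line_rotation  # rotate with the line
    angle %= 360
    return angle - 360 if angle > 180 else angle


ranges = parse_vertical(SAMPLE)
text = "\u6f22\u3041!a1"  # CJK ideograph, small hiragana a, '!', 'a', '1'
print(resolve([base_value(ranges, ch) for ch in text]))
# ['cr', 'cm', 'cu', 'rt', 'rt']
```

In this example the small hiragana letter and the exclamation mark are activated because they follow a cr character, while the digit after the Latin letter falls back to rt. For mo, glyph_rotation gives +90 degrees both for a +90 degree line with bidi and for a -90 degree line without bidi, matching the remark above that the Mongolian glyphs come out the same either way.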