Basic Line Breaking Rules for Tibetan, Dzongkha & Ladakhi

Date: 2005-02-14    Rev 3            L2/05-073
 

This document presents a draft of basic line breaking rules for Tibetan, Dzongkha & Ladakhi, as written up by Christopher Fynn in e-mail discussions with Mark Davis and myself. He notes that the different rules below should, if possible, be weighted: Rules 1 and 2 are preferable to Rule 3, and Rule 4 gives two cases of unequal strength. Such differentiation properly belongs in a layout system but would be absent from regular line break analysis.

The flow of the text is still pretty rough and shows the tell-tale signs of having been assembled from an e-mail interchange.

The proposed changes to the line break classes for Unicode 4.1 are carried out in Appendix A.


1 Character classes

It is useful to assign some character classes in the style of UAX#29 Text Boundaries. Some of the sets defined here merely collect characters of similar overall behavior in the Tibetan context. They do not represent the minimal number of classes for line breaking or text boundary determination. In other words, several of these classes may be combined in an actual specification and/or be subsumed in existing classes, such as NU for digits.

tibetan_consonant = [U+0F40, U+0F41, U+0F43 - U+0F67] // base consonants
tibetan_tsheg  = U+0F0B  // primary syllable & word delimiter
tibetan_impliedtsheg = [U+0F7F,U+0F85] // tsheg is implied
tibetan_nobreaktsheg = U+0F0C // use when tsheg is supposed to be non-breaking

tibetan_shad = [U+0F0D - U+0F11, U+0F14] // phrase delimiters
tibetan_impliedshad = [U+0F40, U+0F42] // shad is implied

tibetan_head_letter = [U+0F01 - U+0F04, U+0F06, U+0F07, U+0F09, U+0F0A, U+0FD0, U+0FD1] // head letters
tibetan_etcetera =  [U+0F34, U+0FBF, U+0FBE]

The following characters should be ignored, wherever they appear in a string, for the purpose of determining a break opportunity:

tibetan_ignorables = [U+0F18, U+0F19, U+0F35, U+0F37, U+0F39, U+0F3E, U+0F3F, U+0F71 - U+0F7E,
                                U+0F80 - U+0F84, U+0F86 - U+0F87, U+0F90 - U+0FBC, U+0FC6]

Note: U+0F35 and U+0F37 are used as highlighting or emphasis marks -  somewhat like underlining (or <em> </em>) in Latin text. U+0F18, U+0F19, U+0F3E & U+0F3F are properly used only with digits (in almanacs). They should not occur with letters.

U+0F12 and U+0F08 are used at the start of new sections or topics. In traditional orthography these should not occur at the beginning of a line, but in some modern Tibetan text they do. It is probably best not to allow a break before these unless they are preceded by a hard return or new line character.

tibetan_startsegment = [U+0F08, U+0F12]  

These are the remaining characters; they mostly seem to be diverse marks:

tibetan_markother = [U+0F00, U+0F05, U+0F36, U+0F38, U+0F88-U+0F8B, U+0FC0-U+0FC5, U+0FC7-U+0FCC] // regular symbols
tibetan_open = [U+0F3A, U+0F3C] // opening paired punctuation
tibetan_close = [U+0F3B, U+0F3D] // closing paired punctuation


Numbers:

tibetan_digits = [U+0F20-U+0F29]   // decimal digits
tibetan_digitsminushalf = [U+0F2A-U+0F33]

The latter are *extremely* rare - Chris Fynn writes: "in over thirty years of reading Tibetan documents the only one I have ever seen in actual use is U+0F2A (Tibetan one-half) and that in less than a dozen documents."
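
The sets above can be written down as a small lookup table. The following is only a sketch: the table name TIBETAN_CLASSES and the helper classes_of are hypothetical, the ranges are copied from the definitions in this section, and tibetan_markother is left out for brevity. Because the sets overlap (U+0F40 is both a consonant and an implied-shad character), the helper returns every matching class rather than a single one.

```python
# Inclusive (first, last) code point ranges, copied from the sets in this section.
# TIBETAN_CLASSES and classes_of are illustrative names, not part of any standard.
TIBETAN_CLASSES = {
    "tibetan_consonant": [(0x0F40, 0x0F41), (0x0F43, 0x0F67)],
    "tibetan_tsheg": [(0x0F0B, 0x0F0B)],
    "tibetan_impliedtsheg": [(0x0F7F, 0x0F7F), (0x0F85, 0x0F85)],
    "tibetan_nobreaktsheg": [(0x0F0C, 0x0F0C)],
    "tibetan_shad": [(0x0F0D, 0x0F11), (0x0F14, 0x0F14)],
    "tibetan_impliedshad": [(0x0F40, 0x0F40), (0x0F42, 0x0F42)],
    "tibetan_head_letter": [(0x0F01, 0x0F04), (0x0F06, 0x0F07),
                            (0x0F09, 0x0F0A), (0x0FD0, 0x0FD1)],
    "tibetan_etcetera": [(0x0F34, 0x0F34), (0x0FBE, 0x0FBF)],
    "tibetan_ignorables": [(0x0F18, 0x0F19), (0x0F35, 0x0F35), (0x0F37, 0x0F37),
                           (0x0F39, 0x0F39), (0x0F3E, 0x0F3F), (0x0F71, 0x0F7E),
                           (0x0F80, 0x0F84), (0x0F86, 0x0F87), (0x0F90, 0x0FBC),
                           (0x0FC6, 0x0FC6)],
    "tibetan_startsegment": [(0x0F08, 0x0F08), (0x0F12, 0x0F12)],
    "tibetan_open": [(0x0F3A, 0x0F3A), (0x0F3C, 0x0F3C)],
    "tibetan_close": [(0x0F3B, 0x0F3B), (0x0F3D, 0x0F3D)],
    "tibetan_digits": [(0x0F20, 0x0F29)],
    "tibetan_digitsminushalf": [(0x0F2A, 0x0F33)],
}


def classes_of(cp):
    """Return the sorted names of all classes that contain code point cp."""
    return sorted(
        name
        for name, ranges in TIBETAN_CLASSES.items()
        if any(first <= cp <= last for first, last in ranges)
    )
```

For example, classes_of(0x0F0B) yields only tibetan_tsheg, while a non-Tibetan code point yields an empty list.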

1a Proposed Line Break Properties

Based on the analysis and rules below, these character classes are assigned the following line break properties:

LB Property    Tibetan character classes from Section 1
AL             tibetan_consonant, tibetan_digitsminushalf, tibetan_markother, tibetan_impliedshad
GL             tibetan_startsegment, tibetan_nobreaktsheg
CM             tibetan_ignorables, tibetan_combiningfornumbers
NU             tibetan_digits
BA             tibetan_tsheg, tibetan_impliedtsheg, tibetan_etcetera
EX             tibetan_shad
BB             tibetan_head_letter
OP             tibetan_open
CL             tibetan_close


2 Line Breaking Rules

Rule 1 Primary Opportunity

In Tibetan, the primary break opportunity is after words:

   tibetan_consonant tibetan_tsheg @  tibetan_consonant

There should be *no* break after the tsheg in the following sequence:

    tibetan_tsheg  tibetan_shad

(U+0F0C was encoded to replace U+0F0B in those instances.)
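
As a rough illustration of Rule 1 taken in isolation, the sketch below reports a break opportunity after every U+0F0B that is directly followed by a consonant. The function names are hypothetical, and the consonant range U+0F40..U+0F6A is an assumption borrowed from Rule 1b. A tsheg followed by a shad, and the non-breaking tsheg U+0F0C, produce no opportunity.

```python
TSHEG = "\u0F0B"


def is_consonant(ch):
    # Assumption: consonants are U+0F40..U+0F6A, the range quoted in Rule 1b.
    return "\u0F40" <= ch <= "\u0F6A"


def tsheg_breaks(text):
    """Return each index i where a break is allowed just before text[i]."""
    return [
        i
        for i in range(1, len(text))
        if text[i - 1] == TSHEG and is_consonant(text[i])
    ]
```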

Rule 1b Implied Tsheg

Where U+0F7F occurs, neither a tsheg (U+0F0B) nor a shad (U+0F0D or U+0F0E) is ever written after it, though one of these characters is *always* implied, since this character only occurs at the end of a syllable. Therefore, if U+0F7F is immediately followed by a consonant (U+0F40...U+0F6A), it can be treated as a tsheg (i.e. a break opportunity exists immediately after it); where it is followed by a space, however, it should have the same breaking properties as U+0F0D or U+0F0E followed by a space.

              tibetan_consonant U+0F7F @ tibetan_consonant

Although it is technically an accent used in Sanskrit, in some texts the character U+0F7F is used like U+0F14 (a kind of shad). The case of U+0F7F is an interesting anomaly where a combining mark has a line break function. This can, of course, be approximated by giving that character a specific line break class other than 'CM', but it would make it generally impossible for software to perform cluster analysis up-front, at least not without some exceptional logic for this case.

U+0F85 TIBETAN MARK PALUTA
This character should probably be treated the same as U+0F7F (i.e. a tsheg should be implied where this character is followed by a consonant character and a shad implied where this character is followed by whitespace).
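
The context dependence of U+0F7F and U+0F85 can be sketched as a small helper that reports which delimiter is implied at a given position. The name implied_delimiter and the string return values are hypothetical; the consonant range is the one quoted above, and Python's str.isspace is used as a stand-in for "whitespace".

```python
IMPLIED = {"\u0F7F", "\u0F85"}


def is_consonant(ch):
    # Consonant range as quoted in Rule 1b (U+0F40..U+0F6A).
    return "\u0F40" <= ch <= "\u0F6A"


def implied_delimiter(text, i):
    """Return 'tsheg', 'shad' or None for the character at text[i]."""
    if text[i] not in IMPLIED or i + 1 >= len(text):
        return None
    following = text[i + 1]
    if is_consonant(following):
        return "tsheg"  # break opportunity immediately after text[i]
    if following.isspace():
        return "shad"   # behaves like U+0F0D / U+0F0E followed by a space
    return None
```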

Rule 2 Another primary break opportunity

Another primary break opportunity is between phrases (before the start of the first word in a new phrase):

   tibetan_consonant tibetan_shad [:whitespace] tibetan_shad @ tibetan_consonant

   tibetan_consonant  tibetan_etcetera [:whitespace] tibetan_shad @ tibetan_consonant

A shad is not written after U+0F40 or U+0F42, though when followed by whitespace, one is implied:

[U+0F40, U+0F42] [:whitespace] tibetan_shad  @ tibetan_consonant

This is because these characters have a long stem on the right side similar in appearance  to a shad.

The same thing but with no opening shad:

   tibetan_consonant tibetan_consonant  tibetan_shad [:whitespace] @ tibetan_consonant

   tibetan_consonant tibetan_shad [:whitespace] @ tibetan_consonant

   tibetan_consonant tibetan_etcetera [:whitespace] @ tibetan_consonant

   [U+0F40, U+0F42] [:whitespace] @ tibetan_consonant
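
The Rule 2 patterns share one shape: a phrase-ending character, whitespace, an optional opening shad, then the consonant that starts the next phrase. A minimal regular-expression sketch of that shape follows. It assumes that at least one whitespace character is required, ignores the leading consonant context and any ignorable marks, and uses hypothetical names throughout.

```python
import re

SHAD = "\u0F0D-\u0F11\u0F14"       # tibetan_shad
ETC = "\u0F34\u0FBE\u0FBF"         # tibetan_etcetera
IMPLIED_SHAD = "\u0F40\u0F42"      # shad is implied after these letters
CONSONANT = "\u0F40-\u0F6A"        # assumed consonant range (see Rule 1b)

# phrase end, whitespace, optional opening shad, then a consonant (not consumed)
PHRASE_BREAK = re.compile(
    f"[{SHAD}{ETC}{IMPLIED_SHAD}]\\s+[{SHAD}]?(?=[{CONSONANT}])"
)


def phrase_breaks(text):
    """Return each index i where a break is allowed just before text[i]."""
    return [m.end() for m in PHRASE_BREAK.finditer(text)]
```

The opening shad is kept with the following phrase, so the reported index always points at the consonant.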

Rule 3 Secondary break opportunity before head letters


tibetan_consonant [tibetan_shad, tibetan_etcetera, U+0F7F] [:whitespace] @ tibetan_head_letter

[U+0F40, U+0F42]  [:whitespace] @ tibetan_head_letter
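
Since Rule 3 is described as secondary, a layout system needs some policy for choosing between opportunities of different strength. The sketch below shows one possible policy, which is an assumption and not something the rules prescribe: take the last primary opportunity that fits within the line, and fall back to a secondary one only when no primary opportunity fits. The function name and the 'primary'/'secondary' labels are hypothetical.

```python
def choose_break(opportunities, limit):
    """Pick a break index from (index, strength) pairs that fit within limit.

    strength is 'primary' (Rules 1 and 2) or 'secondary' (Rule 3).
    Returns None when nothing fits.
    """
    fitting = [(idx, strength) for idx, strength in opportunities if idx <= limit]
    primary = [idx for idx, strength in fitting if strength == "primary"]
    if primary:
        return max(primary)
    secondary = [idx for idx, strength in fitting if strength == "secondary"]
    return max(secondary) if secondary else None
```

A real layout system would more likely trade off strength against how full the line is; this strict ordering is only the simplest variant.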

Rule 4 Digits

Characters of class tibetan_digits form numbers, with no break between two consecutive digits.
There should probably be a strong break opportunity after a number and a weak opportunity before a number.

CF: If you have a situation:
     [tibetan_consonant][tibetan_digits][tibetan_consonant]
or:
     [tibetan_consonant]U+0F0B[tibetan_digits][tibetan_consonant]

It is best to break *after* the digits, as numbers in-line with text are most frequently connected with the text immediately preceding the digits, not the text after.
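
The preference for breaking after a number can be sketched as follows. The example only finds the strong opportunity after a complete run of digits that is followed by a consonant; it does not model the weak opportunity before the number. The names are hypothetical and the consonant range U+0F40..U+0F6A is the one quoted in Rule 1b.

```python
import re

# A maximal run of tibetan_digits followed (not consumed) by a consonant.
DIGIT_RUN = re.compile("[\u0F20-\u0F29]+(?=[\u0F40-\u0F6A])")


def breaks_after_numbers(text):
    """Return each index i where a break is allowed just before text[i]."""
    return [m.end() for m in DIGIT_RUN.finditer(text)]
```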

AF: My working hypothesis is that
         @ tibetan_consonant
is equivalent to
         @ tibetan_consonant
   or
         @ tibetan_digits
   or
         @ tibetan_digitsminushalf
I am further assuming that no breaks are allowed around tibetan_markother.
The notes already address tibetan_startsegment and for simplicity I will assume that the same is true for tibetan_startother.

Rule 5: Tibetan Etcetera / ellipsis characters

A break is also allowed after U+0F34 and U+0FBE. U+0FBF is a little different as it is often written doubled or tripled  - in which case a break should really only occur after the last U+0FBF in the sequence.

AF: Note that Line break class BA prevents a break before itself
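
Rule 5 on its own can be sketched as below: a break is reported after U+0F34 and U+0FBE, and after U+0FBF only when it is the last one in a run. The function name is hypothetical, and the sketch deliberately ignores the other rules (for example, an etcetera followed by whitespace and a shad, which Rule 2 keeps together).

```python
def etcetera_breaks(text):
    """Return each index i where a break is allowed just before text[i]."""
    breaks = []
    for i in range(len(text) - 1):
        ch, following = text[i], text[i + 1]
        if ch in ("\u0F34", "\u0FBE"):
            breaks.append(i + 1)
        elif ch == "\u0FBF" and following != "\u0FBF":
            # Only the last U+0FBF of a doubled or tripled sequence counts.
            breaks.append(i + 1)
    return breaks
```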

 

3 Analysis in terms of the Line Break algorithm

AF: The rules stated above use a context length that's longer than can be expressed in many typical implementations of linebreaking. In the Unicode Line break algorithm, the context usually is

B  A
or
B [:whitespace] A

Here we are using the convention from UAX#14 to use B for 'before' and A for 'after'. It's therefore reasonable to see whether it isn't possible to express the same rules using more limited context. I proposed that for the tsheg we simply write
        tsheg @ consonant
or even
        tsheg @ any

CF: OK

after observing first any rules that unconditionally prohibit a break before any particular characters. In all other instances, we rely on the user to use nobreak-tsheg where needed.

CF: OK (- though a "smart" IME might do this too)

For the shads/etceteras it seems that
        shad/etcetera [:whitespace] shad/etc
is always kept together. The second rule seems to be that
        shad/etcetera, U+0F40, U+0F42 [:whitespace] @ consonant
allows a break before the consonant.

CF: Yes

Third:

        shad/etcetera,U+0F7F,U+0F40, U+0F42 [:whitespace] @ tibetan_head_letter

and finally no break in

        [U+0F40, U+0F42] [:whitespace]  shad/etc

Together these would seem to give a reasonable approximation of your longer rules, different essentially only when the whole expression isn't started by a consonant on the left, but by something else.

As commented above, where I write 'consonant' in the rewritten rules, I assume that digits also qualify, and would like you to tell me which, if any, of the other marks affect the break opportunities, and if so, how.

CF: You could also allow breaks after the symbols U+0FC0..U+0FC5 and U+0FC7..U+0FCC

CF: I'm not certain what rules can be applied to U+0F1A..U+0F1F and U+0FCF
but it is probably best to have them as non-breaking. These characters usually occur in formatted tables not running text and it seems safest to leave it up to users to manually insert a break after these characters if they need one.

For those familiar with line break classes in UAX#14, it would seem to me that the tsheg could be LB class BA and the shad/etc and similar characters could be EX, as there is no break between EX SPACE EX.
The case of 0F7F would need to be investigated.

CF: U+0F7F @ consonant/digit is OK

Digits and consonants would remain NU and AL. The half-minus digits could be either NU or AL; I think they are AL now.
Things like 0F08 that should allow no break before or after could become GL. That's a bit strong, but would serve the purpose.

The otherwise non-specified marks would have to be sorted into classes - but absent some more information about their behavior, that's not possible at this time. Currently, they are assigned AL and therefore behave like Consonants.
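
To see how the proposed classes behave with the limited "B A" and "B [:whitespace] A" context, the following sketch applies a deliberately reduced set of pair rules. It is not the UAX#14 algorithm: it covers only the classes used in this document, it assumes combining marks (CM) have already been folded into their base character on the 'before' side, and the function name is hypothetical.

```python
def break_allowed(before, after, spaces_between=False):
    """Decide whether a break is allowed between two line break classes.

    before/after are class names such as 'AL', 'BA' or 'EX'.
    spaces_between is True for the context B [:whitespace] A.
    """
    if after in {"EX", "CL"}:
        return False            # no break before these, even after spaces
    if before == "OP":
        return False            # no break after an opening bracket
    if spaces_between:
        return True             # otherwise a space provides an opportunity
    if after in {"BA", "GL", "CM"}:
        return False            # no break directly before these
    if before in {"GL", "BB"}:
        return False            # no break directly after these
    if before in {"AL", "NU"} and after in {"AL", "NU"}:
        return False            # letters and digits stay together
    return True                 # e.g. BA followed by AL: break after tsheg
```

Under these assumptions the four shortened rules come out as stated: a tsheg (BA) allows a break before a consonant but not before a shad (EX), a shad followed by a space and another shad stays together, and a shad followed by a space allows a break before a consonant (AL) or head letter (BB).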

4 Additional Notes

AF: your notes and descriptions might make a good section to be added in UAX#14

CF: OK if you want to include them in some form.

Notes:

2. For proper line formatting:

 i. Traditionally the ends of lines are padded up to the right margin by displaying three or more glyphs for
    U+0F0B at the end of the line (except where there is a new line char).

ii. Where, due to line wrapping, a shad [U+0F0D or U+0F0E] appears at the end of a single word starting a new line
    [e.g. <NL> word *shad* whitespace shad more-words or: <NL> word *shad* whitespace more-words]
    that *shad* should be displayed using the glyph for U+0F11.

=========================================
U+0F0B is the nearest thing in Tibetan to a space in English; actual spaces in Tibetan should usually be treated as NBSPACE.
Since U+0F0B is very frequent in Tibetan (occurring after every 1 to 4 base characters) there should be ample line break opportunities.

AF: A note suggests treating SPACE as NBSP in Tibetan. As NBSP is available and the use of SPACE is rare, I think we should discourage that practice. It needlessly complicates the algorithm.
CF: OK. In the Bhutanese keyboard layout tsheg is assigned to the spacebar, nbspace to shift-spacebar, U+0F0C to alt-spacebar and space to altgr-spacebar - so it is much easier to type NBSPACE than it is to type SPACE. If other Tibetan script input methods follow a similar practice, U+0020 will be rare, as it should be in Tibetan text.

The only time U+0F0B does not provide a break opportunity is when immediately followed by U+0F0D or U+0F0F (this should happen only after the consonant U+0F44 as U+0F0D or U+0F0F should not be written after this letter without an intervening tsheg - with all other consonants no intervening tsheg is written before shad.)

Tibetan punctuation other than U+0F0B does *not* provide an automatic break opportunity except as in rules above.

There is no punctuation equivalent to a period in Tibetan; tibetan_shad characters indicate the end of a "phrase", not a sentence. "Phrases" are often metrical (written after every N syllables) and a new sentence can often start in the middle of a "phrase". Sentence boundaries need to be determined grammatically rather than by punctuation (and in any case they don't affect line breaking).

Traditionally new chapters, new sections,  new topics run right on from the previous text (continuing on the same line) and the only indication is a mark like U+0F06, U+0F12 or U+0F08; or some words like "here ends the first chapter about such-and-such second chapter...". This means you can have many pages of Tibetan text - even an entire book - without a hard return character!

Traditionally new sections should *not* start at the beginning of a new line. However, some modern books, newspapers and magazines format text more like English, with a break before each section or topic and (often) the title of the section on a separate line. Where this is done, people do insert a hard return or new paragraph, so there should be no need to worry about this for automatic line breaking.

Western punctuation (full stop, question mark, exclamation mark, comma, colon, semicolon, quotes) is starting to appear in Tibetan documents, particularly those published in India, Bhutan & Nepal. It is probably OK to treat these as in English since there are no formal rules for their use in Tibetan as yet.

In Tibetan documents published in the PRC, CJK bracket and punctuation characters occur quite frequently - again these should probably be treated as in Chinese (except they should be treated as left to right characters in Tibetan).

For word selection you can also use U+0F0B, U+0F0D etc. In Tibetan every syllable has a lexical meaning. While there are many compound (multi-syllable) words, there is no easy way of differentiating between words and syllables or, except for a handful of grammatical particles, of telling whether a syllable forms a compound with the syllable that precedes or follows it. If you wanted to find boundaries of most compound words you'd need both a dictionary and a grammatical parser.

CF: There are also rules for formatting at the start of new pages, formatting page numbers, nested & enumerated lists, nested sections & topics and so on. If you also need information on these, let me know.

CF: Traditionally there is nothing akin to a paragraph in Tibetan text. Situations where you have many pages of text without a paragraph break (hard return) can be common in Tibetan. The closest thing to a paragraph in Tibetan is a new section or topic starting with U+0F12 or U+0F08, but these occur in-line (i.e. one section ends and a new one starts on the same line, and the new section is marked only by the presence of one of these characters).

Appendix A: Tibetan Script in LineBreak.txt

After applying the proposed line break classes, the section of LineBreak.txt for Tibetan will look like this:

0F00;AL # TIBETAN SYLLABLE OM
0F01;BB # TIBETAN MARK GTER YIG MGO TRUNCATED A
0F02;BB # TIBETAN MARK GTER YIG MGO -UM RNAM BCAD MA
0F03;BB # TIBETAN MARK GTER YIG MGO -UM GTER TSHEG MA
0F04;BB # TIBETAN MARK INITIAL YIG MGO MDUN MA
0F05;AL # TIBETAN MARK CLOSING YIG MGO SGAB MA
0F06;BB # TIBETAN MARK CARET YIG MGO PHUR SHAD MA
0F07;BB # TIBETAN MARK YIG MGO TSHEG SHAD MA
0F08;GL # TIBETAN MARK SBRUL SHAD
0F09;BB # TIBETAN MARK BSKUR YIG MGO
0F0A;BB # TIBETAN MARK BKA- SHOG YIG MGO
0F0B;BA # TIBETAN MARK INTERSYLLABIC TSHEG
0F0C;GL # TIBETAN MARK DELIMITER TSHEG BSTAR
0F0D;EX # TIBETAN MARK SHAD
0F0E;EX # TIBETAN MARK NYIS SHAD
0F0F;EX # TIBETAN MARK TSHEG SHAD
0F10;EX # TIBETAN MARK NYIS TSHEG SHAD
0F11;EX # TIBETAN MARK RIN CHEN SPUNGS SHAD
0F12;GL # TIBETAN MARK RGYA GRAM SHAD
0F13;AL # TIBETAN MARK CARET -DZUD RTAGS ME LONG CAN
0F14;EX # TIBETAN MARK GTER TSHEG
0F15;AL # TIBETAN LOGOTYPE SIGN CHAD RTAGS
0F16;AL # TIBETAN LOGOTYPE SIGN LHAG RTAGS
0F17;AL # TIBETAN ASTROLOGICAL SIGN SGRA GCAN -CHAR RTAGS
0F18;CM # TIBETAN ASTROLOGICAL SIGN -KHYUD PA
0F19;CM # TIBETAN ASTROLOGICAL SIGN SDONG TSHUGS
0F1A;AL # TIBETAN SIGN RDEL DKAR GCIG
0F1B;AL # TIBETAN SIGN RDEL DKAR GNYIS
0F1C;AL # TIBETAN SIGN RDEL DKAR GSUM
0F1D;AL # TIBETAN SIGN RDEL NAG GCIG
0F1E;AL # TIBETAN SIGN RDEL NAG GNYIS
0F1F;AL # TIBETAN SIGN RDEL DKAR RDEL NAG
0F20;NU # TIBETAN DIGIT ZERO
0F21;NU # TIBETAN DIGIT ONE
0F22;NU # TIBETAN DIGIT TWO
0F23;NU # TIBETAN DIGIT THREE
0F24;NU # TIBETAN DIGIT FOUR
0F25;NU # TIBETAN DIGIT FIVE
0F26;NU # TIBETAN DIGIT SIX
0F27;NU # TIBETAN DIGIT SEVEN
0F28;NU # TIBETAN DIGIT EIGHT
0F29;NU # TIBETAN DIGIT NINE
0F2A;AL # TIBETAN DIGIT HALF ONE
0F2B;AL # TIBETAN DIGIT HALF TWO
0F2C;AL # TIBETAN DIGIT HALF THREE
0F2D;AL # TIBETAN DIGIT HALF FOUR
0F2E;AL # TIBETAN DIGIT HALF FIVE
0F2F;AL # TIBETAN DIGIT HALF SIX
0F30;AL # TIBETAN DIGIT HALF SEVEN
0F31;AL # TIBETAN DIGIT HALF EIGHT
0F32;AL # TIBETAN DIGIT HALF NINE
0F33;AL # TIBETAN DIGIT HALF ZERO
0F34;BA # TIBETAN MARK BSDUS RTAGS
0F35;CM # TIBETAN MARK NGAS BZUNG NYI ZLA
0F36;AL # TIBETAN MARK CARET -DZUD RTAGS BZHI MIG CAN
0F37;CM # TIBETAN MARK NGAS BZUNG SGOR RTAGS
0F38;AL # TIBETAN MARK CHE MGO
0F39;CM # TIBETAN MARK TSA -PHRU
0F3A;OP # TIBETAN MARK GUG RTAGS GYON
0F3B;CL # TIBETAN MARK GUG RTAGS GYAS
0F3C;OP # TIBETAN MARK ANG KHANG GYON
0F3D;CL # TIBETAN MARK ANG KHANG GYAS
0F3E;CM # TIBETAN SIGN YAR TSHES
0F3F;CM # TIBETAN SIGN MAR TSHES
0F40;AL # TIBETAN LETTER KA
0F41;AL # TIBETAN LETTER KHA
0F42;AL # TIBETAN LETTER GA
0F43;AL # TIBETAN LETTER GHA
0F44;AL # TIBETAN LETTER NGA
0F45;AL # TIBETAN LETTER CA
0F46;AL # TIBETAN LETTER CHA
0F47;AL # TIBETAN LETTER JA
0F49;AL # TIBETAN LETTER NYA
0F4A;AL # TIBETAN LETTER TTA
0F4B;AL # TIBETAN LETTER TTHA
0F4C;AL # TIBETAN LETTER DDA
0F4D;AL # TIBETAN LETTER DDHA
0F4E;AL # TIBETAN LETTER NNA
0F4F;AL # TIBETAN LETTER TA
0F50;AL # TIBETAN LETTER THA
0F51;AL # TIBETAN LETTER DA
0F52;AL # TIBETAN LETTER DHA
0F53;AL # TIBETAN LETTER NA
0F54;AL # TIBETAN LETTER PA
0F55;AL # TIBETAN LETTER PHA
0F56;AL # TIBETAN LETTER BA
0F57;AL # TIBETAN LETTER BHA
0F58;AL # TIBETAN LETTER MA
0F59;AL # TIBETAN LETTER TSA
0F5A;AL # TIBETAN LETTER TSHA
0F5B;AL # TIBETAN LETTER DZA
0F5C;AL # TIBETAN LETTER DZHA
0F5D;AL # TIBETAN LETTER WA
0F5E;AL # TIBETAN LETTER ZHA
0F5F;AL # TIBETAN LETTER ZA
0F60;AL # TIBETAN LETTER -A
0F61;AL # TIBETAN LETTER YA
0F62;AL # TIBETAN LETTER RA
0F63;AL # TIBETAN LETTER LA
0F64;AL # TIBETAN LETTER SHA
0F65;AL # TIBETAN LETTER SSA
0F66;AL # TIBETAN LETTER SA
0F67;AL # TIBETAN LETTER HA
0F68;AL # TIBETAN LETTER A
0F69;AL # TIBETAN LETTER KSSA
0F6A;AL # TIBETAN LETTER FIXED-FORM RA
0F71;CM # TIBETAN VOWEL SIGN AA
0F72;CM # TIBETAN VOWEL SIGN I
0F73;CM # TIBETAN VOWEL SIGN II
0F74;CM # TIBETAN VOWEL SIGN U
0F75;CM # TIBETAN VOWEL SIGN UU
0F76;CM # TIBETAN VOWEL SIGN VOCALIC R
0F77;CM # TIBETAN VOWEL SIGN VOCALIC RR
0F78;CM # TIBETAN VOWEL SIGN VOCALIC L
0F79;CM # TIBETAN VOWEL SIGN VOCALIC LL
0F7A;CM # TIBETAN VOWEL SIGN E
0F7B;CM # TIBETAN VOWEL SIGN EE
0F7C;CM # TIBETAN VOWEL SIGN O
0F7D;CM # TIBETAN VOWEL SIGN OO
0F7E;CM # TIBETAN SIGN RJES SU NGA RO
0F7F;BA # TIBETAN SIGN RNAM BCAD
0F80;CM # TIBETAN VOWEL SIGN REVERSED I
0F81;CM # TIBETAN VOWEL SIGN REVERSED II
0F82;CM # TIBETAN SIGN NYI ZLA NAA DA
0F83;CM # TIBETAN SIGN SNA LDAN
0F84;CM # TIBETAN MARK HALANTA
0F85;BA # TIBETAN MARK PALUTA
0F86;CM # TIBETAN SIGN LCI RTAGS
0F87;CM # TIBETAN SIGN YANG RTAGS
0F88;AL # TIBETAN SIGN LCE TSA CAN
0F89;AL # TIBETAN SIGN MCHU CAN
0F8A;AL # TIBETAN SIGN GRU CAN RGYINGS
0F8B;AL # TIBETAN SIGN GRU MED RGYINGS
0F90;CM # TIBETAN SUBJOINED LETTER KA
0F91;CM # TIBETAN SUBJOINED LETTER KHA
0F92;CM # TIBETAN SUBJOINED LETTER GA
0F93;CM # TIBETAN SUBJOINED LETTER GHA
0F94;CM # TIBETAN SUBJOINED LETTER NGA
0F95;CM # TIBETAN SUBJOINED LETTER CA
0F96;CM # TIBETAN SUBJOINED LETTER CHA
0F97;CM # TIBETAN SUBJOINED LETTER JA
0F99;CM # TIBETAN SUBJOINED LETTER NYA
0F9A;CM # TIBETAN SUBJOINED LETTER TTA
0F9B;CM # TIBETAN SUBJOINED LETTER TTHA
0F9C;CM # TIBETAN SUBJOINED LETTER DDA
0F9D;CM # TIBETAN SUBJOINED LETTER DDHA
0F9E;CM # TIBETAN SUBJOINED LETTER NNA
0F9F;CM # TIBETAN SUBJOINED LETTER TA
0FA0;CM # TIBETAN SUBJOINED LETTER THA
0FA1;CM # TIBETAN SUBJOINED LETTER DA
0FA2;CM # TIBETAN SUBJOINED LETTER DHA
0FA3;CM # TIBETAN SUBJOINED LETTER NA
0FA4;CM # TIBETAN SUBJOINED LETTER PA
0FA5;CM # TIBETAN SUBJOINED LETTER PHA
0FA6;CM # TIBETAN SUBJOINED LETTER BA
0FA7;CM # TIBETAN SUBJOINED LETTER BHA
0FA8;CM # TIBETAN SUBJOINED LETTER MA
0FA9;CM # TIBETAN SUBJOINED LETTER TSA
0FAA;CM # TIBETAN SUBJOINED LETTER TSHA
0FAB;CM # TIBETAN SUBJOINED LETTER DZA
0FAC;CM # TIBETAN SUBJOINED LETTER DZHA
0FAD;CM # TIBETAN SUBJOINED LETTER WA
0FAE;CM # TIBETAN SUBJOINED LETTER ZHA
0FAF;CM # TIBETAN SUBJOINED LETTER ZA
0FB0;CM # TIBETAN SUBJOINED LETTER -A
0FB1;CM # TIBETAN SUBJOINED LETTER YA
0FB2;CM # TIBETAN SUBJOINED LETTER RA
0FB3;CM # TIBETAN SUBJOINED LETTER LA
0FB4;CM # TIBETAN SUBJOINED LETTER SHA
0FB5;CM # TIBETAN SUBJOINED LETTER SSA
0FB6;CM # TIBETAN SUBJOINED LETTER SA
0FB7;CM # TIBETAN SUBJOINED LETTER HA
0FB8;CM # TIBETAN SUBJOINED LETTER A
0FB9;CM # TIBETAN SUBJOINED LETTER KSSA
0FBA;CM # TIBETAN SUBJOINED LETTER FIXED-FORM WA
0FBB;CM # TIBETAN SUBJOINED LETTER FIXED-FORM YA
0FBC;CM # TIBETAN SUBJOINED LETTER FIXED-FORM RA
0FBE;BA # TIBETAN KU RU KHA
0FBF;BA # TIBETAN KU RU KHA BZHI MIG CAN
0FC0;AL # TIBETAN CANTILLATION SIGN HEAVY BEAT
0FC1;AL # TIBETAN CANTILLATION SIGN LIGHT BEAT
0FC2;AL # TIBETAN CANTILLATION SIGN CANG TE-U
0FC3;AL # TIBETAN CANTILLATION SIGN SBUB -CHAL
0FC4;AL # TIBETAN SYMBOL DRIL BU
0FC5;AL # TIBETAN SYMBOL RDO RJE
0FC6;CM # TIBETAN SYMBOL PADMA GDAN
0FC7;AL # TIBETAN SYMBOL RDO RJE RGYA GRAM
0FC8;AL # TIBETAN SYMBOL PHUR PA
0FC9;AL # TIBETAN SYMBOL NOR BU
0FCA;AL # TIBETAN SYMBOL NOR BU NYIS -KHYIL
0FCB;AL # TIBETAN SYMBOL NOR BU GSUM -KHYIL
0FCC;AL # TIBETAN SYMBOL NOR BU BZHI -KHYIL
0FCF;AL # TIBETAN SIGN RDEL NAG GSUM
0FD0;BB # TIBETAN MARK BSKA- SHOG GI MGO RGYAN
0FD1;BB # TIBETAN MARK MNYAM YIG GI MGO RGYAN
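
For experimenting with the listing above, a small reader for this "code;class # name" layout might look as follows. It is only a sketch: the function name is hypothetical, and support for "first..last" ranges is included because the Unicode Character Database data files use that notation, even though this appendix lists single code points only.

```python
def parse_line_break_data(lines):
    """Map code points to line break classes from 'XXXX;CL # name' lines."""
    table = {}
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue
        code, prop = (field.strip() for field in data.split(";"))
        first, _, last = code.partition("..")  # 'XXXX' or 'XXXX..YYYY'
        for cp in range(int(first, 16), int(last or first, 16) + 1):
            table[cp] = prop
    return table


SAMPLE = [
    "0F0B;BA # TIBETAN MARK INTERSYLLABIC TSHEG",
    "0F0C;GL # TIBETAN MARK DELIMITER TSHEG BSTAR",
    "0F0D;EX # TIBETAN MARK SHAD",
    "0F7F;BA # TIBETAN SIGN RNAM BCAD",
]
TABLE = parse_line_break_data(SAMPLE)
```

Looking up TABLE[0x0F0B] then gives the class proposed for the tsheg.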

[END]