PROPOSED DRAFT Unicode Technical Report #9

Unicode Bidirectional Algorithm

Revision 2
Authors Mark Davis (
Date 1998-10-30
This Version
Previous Version n/a
Latest Version


This document presents the BIDI algorithm of The Unicode Standard, Version 2.0, with the Unicode 2.1 corrigenda folded in. It also contains some possible changes currently under discussion by the Unicode Technical Committee (UTC) Bidi subcommittee. Differences from the text of 2.0 are marked with underlines. In the future, this document should be accompanied by a reference implementation.

Status of this document

This document is a preliminary working draft. It is posted for review by the members of the Unicode Technical Committee (UTC). At its next meeting, the UTC may reject this document, review it for suitability to progress to draft status, and/or further amend this document. Please mail any comments to the authors.

3.1 Bidirectional Behavior

The Unicode Standard prescribes a memory representation order known as logical order. When text is presented in horizontal lines, most scripts display characters from left to right. However, there are several scripts (such as Arabic or Hebrew) where the natural ordering of horizontal text is from right to left. If all of the text has the same horizontal direction, then the ordering of the display text is unambiguous. However, when bidirectional text (a mixture of left-to-right and right-to-left horizontal text) is present, some ambiguities can arise in determining the ordering of the displayed characters.

This section describes the algorithm used to determine the directionality for bidirectional Unicode text. The algorithm extends the implicit model currently employed by a number of existing implementations and adds explicit controls for special circumstances. In most cases, there is no need to include additional information with the text to obtain correct display ordering. However, when necessary, additional information can be included in the text by means of a small set of directional formatting codes.

In general, the Unicode Standard does not supply formatting codes; formatting is left up to higher-level protocols. However, in the case of bidirectional text, there are circumstances where an implicit bidirectional ordering is not sufficient to produce comprehensible text. To deal with these cases, a minimal set of directional formatting codes is defined to control the ordering of characters when rendered. This allows exact control of the display ordering for legible interchange and also ensures that plain text used for simple items like filenames or labels can always be correctly ordered for display.

The directional formatting codes are used only to influence the display ordering of text. In all other respects they are ignored: they have no effect on the comparison of text, nor on word breaks, parsing, or numeric analysis. The characters are still interpreted in logical order; only the display is affected. The ordering of bidirectional text depends upon the directional properties of the text. Section 4.3, Directionality, lists the ranges of characters that have each particular directional character type.

Directional Formatting Codes

Two types of explicit codes are used to modify the standard implicit Unicode bidirectional algorithm. In addition, there are implicit ordering codes, the right-to-left and left-to-right marks. All of these codes are limited to the current directional block; that is, their effects are terminated by a block separator. The directional types left-to-right and right-to-left are called strong types, and characters of those types are called strong directional characters. The directional types associated with numbers are called weak types, and characters of those types are called weak directional characters.

Although the term embedding is used for explicit codes, the text within the scope of the codes is not independent of the surrounding text. Characters within an embedding can affect the ordering of characters outside, and vice versa. The algorithm is designed so that the use of explicit codes can be equivalently represented by out-of-line information, such as stylesheet information. However, any alternative representation will be defined by reference to the behavior of the explicit codes in this algorithm.

Explicit Directional Embedding

The following codes signal that a piece of text is to be treated as embedded. For example, an English quotation in the middle of an Arabic sentence could be marked as being embedded left-to-right text. If there were a Hebrew phrase in the middle of the English quotation, then that phrase could be marked as being embedded right-to-left. The following codes allow for nested embeddings.

LRE Left-to-Right Embedding   Treat the following text as embedded left-to-right.
RLE Right-to-Left Embedding   Treat the following text as embedded right-to-left.

The precise meaning of these codes will be made clear in the discussion of the algorithm. The effect of right-to-left line direction, for example, can be accomplished by simply embedding the text with RLE...PDF, where PDF is the terminating code described below.

Explicit Directional Overrides

The following codes allow the bidirectional character types to be overridden when required for special cases, such as for part numbers. The following codes allow for nested directional overrides.

RLO Right-to-Left Override   Force following characters to be treated as strong right-to-left characters.
LRO Left-to-Right Override   Force following characters to be treated as strong left-to-right characters.

The precise meaning of these codes will be made clear in the discussion of the algorithm. The right-to-left override, for example, can be used to force a part number made of mixed English, digits and Hebrew letters to be written from right to left.

Terminating Explicit Directional Code

The following code terminates the effects of the last explicit code (either embedding or override) and restores the bidirectional state to what it was before that code was encountered.


PDF Pop Directional Format   Restore the bidirectional state to what it was before the last LRE, RLE, RLO, LRO.

Implicit Directional Marks

These characters are very light-weight codes. They act exactly like right-to-left or left-to-right characters, except that they do not display (or have any other semantic effect). Their use is often more convenient than the explicit embeddings or overrides, since their scope is much more local (as will be made clear in the following).


RLM Right-to-Left Mark   Right-to-left zero-width character
LRM Left-to-Right Mark   Left-to-right zero-width character

There is no special mention of the implicit directional marks in the following algorithm. That is because their effect on bidirectional ordering is exactly the same as a corresponding strong directional character; the only difference is that they do not appear in the display.

Basic Display Algorithm

This algorithm may be coded differently for speed, but logically speaking it proceeds in two main phases. The input is a stream of text, up to a block separator (such as a paragraph separator). The algorithm only reorders text within a block; characters on one side of a block separator have no effect on characters on the other side. (Also see Section 4.3, Directionality, on the handling of CR, LF, and CRLF.)

Embedding levels are numbers that indicate how deeply text is embedded. ("Embedding levels" in this text are explicitly set by both override controls and by embedding controls.) Odd-numbered levels are right-to-left, and even-numbered levels are left-to-right.

The minimum embedding level of text is zero, and the maximum depth is level 15. (The reason for having a limitation is to provide a precise stack limit for implementations to guarantee the same results. Fifteen levels is far more than sufficient for ordering; the display becomes rather muddied with more than a small number of embeddings!)

For example, in a particular piece of text, Level 0 is plain English text, Level 1 is plain Arabic text, possibly embedded within English level 0 text. Level 2 is English text, possibly embedded within Arabic level 1 text, and so on. Unless their direction is overridden, English text and numbers will always be an even level; Arabic text (excluding numbers) will always be an odd level. The exact meaning of the embedding level will become clear when the reordering algorithm is discussed, but the following provides an example of how the algorithm works.


In the following examples, case is used to indicate different implicit character types for those unfamiliar with right-to-left letters. Uppercase letters stand for right-to-left characters (such as Arabic or Hebrew), while lowercase letters stand for left-to-right characters (such as English or Russian).

Memory:            car is THE CAR in arabic
Character types:   LLL-LL-RRR-RRR-LL-LLLLLL
Resolved levels:   000000011111110000000000

Notice that the neutral character (space) between THE and CAR gets the level of the surrounding characters. This is how the implicit directional marks have an effect; by inserting appropriate directional marks around neutral characters, the level of the neutral characters can be changed.
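The case convention used in the example above can be mimicked with a few lines of Python. This is only an illustrative sketch of the notation, not part of the algorithm: the function name `example_types` is hypothetical, and every non-letter is simply shown as "-" (neutral), as in the "Character types" row.

```python
def example_types(text):
    """Map the document's example convention to type symbols.

    Uppercase letters stand for right-to-left characters (R),
    lowercase letters for left-to-right characters (L), and
    everything else is shown as '-' (neutral).
    """
    symbols = []
    for ch in text:
        if ch.isalpha():
            symbols.append("R" if ch.isupper() else "L")
        else:
            symbols.append("-")
    return "".join(symbols)


print(example_types("car is THE CAR in arabic"))
```

Real implementations would instead look up the bidirectional character type of each character in the character property data.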

Combining characters always attach to the preceding base character in the memory representation; this is logically before the bidirectional algorithm is applied. Hence even after reordering for display and performing character shaping, the glyph representing a combining character will attach to the glyph representing its base character in memory. Depending on the line orientation and the placement direction of base letterform glyphs, it may, for example, attach to the glyph on the left, or on the right, or above.

Bidirectional Character Types

For the purpose of the bidirectional algorithm, characters have the types shown in Table 3-1. (For a specification of the bidirectional character types for a given Unicode value, see Chapter 4, Character Properties.) During the course of the algorithm, combining marks will be given the type of their base character.

Table 3-1. Bidirectional Character Types

Type  Category   Description                 General Scope
L     Strong     Left-to-Right               Most alphabetic, syllabic, Han ideographic characters, LRM, LRO, LRE
R     Strong     Right-to-Left               Arabic and Hebrew alphabets, punctuation specific to those scripts, RLM, RLO, RLE
EN    Weak       European Number             European digits, Eastern Arabic-Indic digits, ...
ES    Weak       European Number Separator   Figure Space, Full Stop (Period), Solidus (Slash), ...
ET    Weak       European Number Terminator  Plus Sign, Minus Sign, Degree, Currency symbols, ...
AN    Weak       Arabic Number               Arabic-Indic digits, Arabic decimal & thousands separators, ...
CS    Weak       Common Number Separator     Colon, Comma, ...
B     Separator  Block Separator             Paragraph Separator, Line Separator
S     Separator  Segment Separator           Tab
WS    Neutral    Whitespace                  Space, No-Break Space, Line Separator, General Punctuation Spaces, ...
ON    Neutral    Other Neutrals              All other characters

Note: The term European digits is used to refer to decimal forms common in Europe and elsewhere, and Arabic-Indic digits to refer to the native Arabic forms. (See Section 8.2, Arabic, for more details on naming digits.)

Table 3-2 lists additional abbreviations used in the examples and internal character types used in the algorithm.

Table 3-2. BIDI Example Abbreviations
Symbol Description
AL Arabic Letter
HL Hebrew Letter
BN Boundary Neutral
N Neutral or Separator (B, S, WS, ON)
CM Combining Mark
sot Start of text
eot End of text
e The text ordering type (L or R) that matches the embedding level direction

Resolving Embedding Levels

Combining character types and explicit codes to produce a list of resolved levels lies at the heart of the bidirectional algorithm. This resolution process consists of seven steps: determining the base level; determining explicit embedding levels and directions; determining explicit overrides; determining embedding and override terminations; resolving weak types; resolving neutral types; and resolving implicit embedding levels.

The Base Level

First, determine the base embedding level, which determines the default horizontal orientation of the text in the current block.

B1. In the text, find the first strong directional character, RLE, LRE, RLO or LRO. (Because block separators delimit text in this algorithm, this will generally be the first strong character after a block separator or at the very beginning of the text.)

B2. If the first strong directional character in the text is right-to-left, RLE, or RLO, then set the base level to one; otherwise, set it to zero.

The direction of the base embedding level is called the base direction. In some contexts this is also known as the paragraph direction or the block direction. The direction of the current embedding level (for a character in question) is called the embedding direction. It is L if the embedding level is even, and R if the embedding level is odd.
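Rules B1 and B2 can be sketched as follows. This is a minimal illustration that assumes the block has already been reduced to a list of type abbreviations, with the explicit codes written as "RLE", "LRE", "RLO" and "LRO"; the function name `base_level` is hypothetical.

```python
def base_level(types):
    """Return the base embedding level of one block (rules B1 and B2).

    `types` is a list of bidirectional type abbreviations for the
    characters of the block, in logical order.
    """
    for t in types:
        # B1: stop at the first strong character or explicit code.
        if t in ("R", "RLE", "RLO"):
            return 1  # B2: right-to-left start gives base level one
        if t in ("L", "LRE", "LRO"):
            return 0
    # No strong character or explicit code at all: default to zero.
    return 0


print(base_level(["WS", "EN", "R", "L"]))  # first strong character is R
```

The base direction then follows from the parity of the result: L for an even level, R for an odd one.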

Explicit Levels and Directions

All explicit embedding levels are determined from the embedding and override codes. The directional level indicates both how deeply the text is embedded and the basic directional flow of the text. Each even level is a left-to-right embedding, and each odd level is a right-to-left embedding. Only levels from 0 to 15 are valid in this phase.

E1. Begin by setting the current embedding level to the base embedding level. Set the directional override status to neutral.

E2. With each RLE, compute the least greater odd level.

a. If this new level would be valid, then this code is valid. Remember (push) the current embedding level and override status. Reset the current level to this new level, and reset the override status to neutral. The RLE and subsequent characters are set to the new current level.

b. If the new level would not be valid, then this code is invalid. Don't change the current level or override status. Set the type of the RLE to BN. Until the matching PDF is reached, do the same to any other embedding or override code.

For example, level 0 => 1; levels 1, 2 => 3; levels 3, 4 => 5; ...13, 14 => 15; above 14, no change (don’t change levels with RLE if the new level would be invalid).

E3. With each LRE, compute the least greater even level.

a. If this new level would be valid, then this code is valid. Remember (push) the current embedding level and override status. Reset the current level to this new level, and reset the override status to neutral. The LRE and subsequent characters are set to the new current level.

b. If the new level would not be valid, then this code is invalid. Set the type of the LRE to BN. Don't change the current level or override status. Until the matching PDF is reached, do the same to any other embedding or override code.

For example, levels 0, 1 => 2; levels 2, 3 => 4; levels 4, 5 => 6; ...12, 13 => 14; above 13, no change (don’t change levels with LRE if the new level would be invalid).
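The "least greater odd level" and "least greater even level" computations in E2 and E3 are small enough to sketch directly. The helper names below are hypothetical, and `next_level` returns `None` to signal an invalid code (a new level above 15), which is one possible way to represent rule (b).

```python
MAX_LEVEL = 15  # only levels 0 to 15 are valid in this phase


def least_greater_odd(level):
    """Smallest odd level strictly greater than `level` (used for RLE)."""
    return level + 1 if level % 2 == 0 else level + 2


def least_greater_even(level):
    """Smallest even level strictly greater than `level` (used for LRE)."""
    return level + 2 if level % 2 == 0 else level + 1


def next_level(level, right_to_left):
    """Return the new level for an embedding code, or None if invalid."""
    new = least_greater_odd(level) if right_to_left else least_greater_even(level)
    return new if new <= MAX_LEVEL else None


for lvl in (0, 1, 2, 13, 14, 15):
    print(lvl, next_level(lvl, True), next_level(lvl, False))
```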

Explicit Overrides

A directional override changes all of the following characters within the current explicit embedding level to a given value and sets the embedding level as with the embedding codes.

O1. With each RLO, compute the least greater odd level.

a. If this new level would be valid, then this code is valid. Remember (push) the current embedding level and override status. Reset the current level to this new level, and reset the override status to right-to-left. The RLO and subsequent characters are set to the new current level.

b. If the new level would not be valid, then this code is invalid. Set the type of the RLO to BN. Don't change the current level or override status. Until the matching PDF is reached, do the same to any other embedding or override code.

O2. With each LRO, compute the least greater even level.

a. If this new level would be valid, then this code is valid. Remember (push) the current embedding level and override status. Reset the current level to this new level, and reset the override status to left-to-right. The LRO and subsequent characters are set to the new current level.

b. If the new level would not be valid, then this code is invalid. Set the type of the LRO to BN. Don't change the current level or override status. Until the matching PDF is reached, do the same to any other embedding or override code.

O3. Whenever the directional override status is not neutral, reset the current character type to the directional override status.

Resetting levels works as described for embeddings in the previous section. For example, if the directional override status is neutral, then all intermediate characters retain their normal values: Arabic characters stay R, Latin characters stay L, neutrals stay N, and so on. If the directional override status is R, then all characters become R.

Terminating Embeddings and Overrides

There is a single code to terminate the scope of the current explicit code, whether an embedding or a directional override. All codes and pushed states are completely popped at block separators.

T4. With each PDF, determine the matching embedding or override code.

a. If there was a valid matching code, set the level and type of the PDF based on the current embedding level and override status. Restore (pop) the last remembered (pushed) embedding level and directional override.

b. If there was no valid matching code, set the PDF to BN.

Note: Higher-level protocols may choose to interpret PDFs that occur when there is no pushed state. For example, a presentation engine may receive blocks of processed Unicode text divided into lines. If the complexity of the text is limited by the higher-level protocol, then such a PDF can be treated as significant rather than ignored.

The following is an example of the levels and types set by the embedding codes:

Memory: L   R   L <RLE> L   R   L <LRO> R   L   R <PDF> R <PDF> L   R
Levels: 0   0   0   1   1   1   1   2   2   2   2   2   1   1   0   0
Types:  L   R   L   R   L   R   L   L   L   L   L   L   R   R   L   R
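The example above can be reproduced with a short sketch of the explicit-code rules (E1 to E3, O1 to O3, and T4). It assumes the input is a list of tokens that are either type abbreviations or the code names "RLE", "LRE", "RLO", "LRO" and "PDF". The function name, the use of `None` for a neutral override status, and the counter used to track codes nested inside an invalid code are all assumptions of this sketch, not part of the text.

```python
MAX_LEVEL = 15


def explicit_levels(tokens, base_level=0):
    """Return a list of (level, type) pairs for the given tokens."""
    level, override = base_level, None  # E1: neutral override status
    stack = []          # remembered (level, override) states
    invalid_depth = 0   # open codes that were found to be invalid
    out = []
    for tok in tokens:
        if tok in ("RLE", "LRE", "RLO", "LRO"):
            rtl = tok[0] == "R"
            new = level + 1
            if (new % 2 == 1) != rtl:   # wrong parity: go one higher
                new += 1
            if invalid_depth == 0 and new <= MAX_LEVEL:
                stack.append((level, override))
                level = new
                direction = "R" if rtl else "L"
                override = direction if tok[2] == "O" else None
                out.append((level, direction))
            else:
                invalid_depth += 1
                out.append((level, "BN"))
        elif tok == "PDF":
            if invalid_depth > 0:       # matches an invalid code
                invalid_depth -= 1
                out.append((level, "BN"))
            elif stack:                 # T4a: valid matching code
                out.append((level, override or ("R" if level % 2 else "L")))
                level, override = stack.pop()
            else:                       # T4b: nothing to pop
                out.append((level, "BN"))
        else:
            out.append((level, override or tok))  # O3
    return out


tokens = "L R L RLE L R L LRO R L R PDF R PDF L R".split()
result = explicit_levels(tokens)
print(" ".join(str(lv) for lv, _ in result))
print(" ".join(ty for _, ty in result))
```

Run on the memory sequence of the example, the sketch prints the same rows of levels and types as shown there.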

T5. All explicit directional embeddings and overrides are completely terminated after any block separator. Return to the state as of B1.

Because the embedding codes have strong types, all overrides and resolution of weak types and neutrals take effect within the bounds of an embedding; that is, nothing within an embedding or override will affect the character direction of codes outside of that embedding, and vice versa. The exceptions are in resolving neutrals (see N4 in the subsection "Resolving Neutral Types" in this chapter) and in adjacent embeddings of the same type:

T6. If two embeddings with the same level are adjacent, then the PDF terminating the first embedding and the code initiating the next embedding are set to BN.

RLE ... PDF RLO ... PDF => R ... BN BN ... R

Note: This provision allows implementations with merging style runs, such as found in most word processors, to achieve the same effect as using embedding codes. If a non-merged display order is needed, that can be achieved by inserting a zero-width space or zero-width no-break space between the two codes.

Resolving Weak Types

Combining marks are now resolved based on the previous characters.

C0. A sequence of combining marks is given the type of the preceding base character; if they are not preceded by a base character (such as when they are at the start of a block), they are given the type ON.

N,CM => N,N
sot,CM => sot,N

The text is now parsed for numbers. This pass will change the directional types European Number Separator, European Number Terminator, and Common Number Separator to be European Number text, Arabic Number text, or Other Neutral text. The text to be scanned may have already had its type altered by directional overrides. If so, then it will not parse as numeric.

P0. Search backwards from each instance of a European number until the first character with type L or R (or block boundary) is found. If a character is found before a block boundary, and if that character belongs to the Arabic block, then change the type of the European number to Arabic number:

sot,EN => sot,EN
L,EN => L,EN
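A minimal sketch of the backward search in P0 follows. It assumes the text is given as a list of type abbreviations in which Arabic letters are tagged AL and Hebrew letters HL (the example abbreviations of Table 3-2), so that "belongs to the Arabic block" can be tested by looking for AL. The function name `apply_p0` is hypothetical.

```python
STRONG = {"L", "R", "AL", "HL"}  # AL and HL are right-to-left letters


def apply_p0(types):
    """Change EN to AN when the nearest preceding strong type is AL."""
    out = list(types)
    for i, t in enumerate(types):
        if t != "EN":
            continue
        # Search backwards for the first strong character.
        for j in range(i - 1, -1, -1):
            if types[j] in STRONG:
                if types[j] == "AL":
                    out[i] = "AN"
                break
        # Reaching the block boundary (sot) leaves the EN unchanged.
    return out


print(apply_p0(["AL", "ON", "EN"]))
print(apply_p0(["L", "EN"]))
```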

P0a. Change any sequence of Boundary Neutrals adjacent to a European Number to a European Number; change any remaining sequences of Boundary Neutrals adjacent to an Arabic Number to an Arabic Number:


P1. A single European separator between two European numbers changes to a European number. A single common separator between two numbers of the same type changes to that type:


P2. A sequence of European terminators adjacent to European numbers changes to all European numbers:

ET, ET, EN => EN, EN, EN
EN, ET, ET => EN, EN, EN
AN, ET, EN => AN, EN, EN

P3. Otherwise, separators, terminators and Boundary Neutrals change to Other Neutral:

ET, AN => N, AN
BN => N
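Rules P1 through P3 can be sketched as three passes over a list of type abbreviations. This is an illustration under stated assumptions: P0 and P0a are taken to have been applied already, the function name is hypothetical, and the result uses "ON" for Other Neutral where the examples above write N.

```python
def resolve_separators_and_terminators(types):
    """Apply P1, P2 and P3 to a list of type abbreviations."""
    t = list(types)
    n = len(t)

    # P1: a single separator between two numbers.
    for i in range(1, n - 1):
        if t[i] == "ES" and t[i - 1] == "EN" and t[i + 1] == "EN":
            t[i] = "EN"
        elif t[i] == "CS" and t[i - 1] == t[i + 1] and t[i - 1] in ("EN", "AN"):
            t[i] = t[i - 1]

    # P2: a run of terminators adjacent to a European number.
    i = 0
    while i < n:
        if t[i] == "ET":
            j = i
            while j < n and t[j] == "ET":
                j += 1
            if (i > 0 and t[i - 1] == "EN") or (j < n and t[j] == "EN"):
                t[i:j] = ["EN"] * (j - i)
            i = j
        else:
            i += 1

    # P3: whatever is left becomes Other Neutral.
    return ["ON" if x in ("ES", "ET", "CS", "BN") else x for x in t]


print(resolve_separators_and_terminators(["AN", "ET", "EN"]))
print(resolve_separators_and_terminators(["ET", "AN"]))
```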

Resolving Neutral Types

The next phase resolves the direction of the neutrals. The results of this phase are that all neutrals become either R or L. Generally, neutrals take on the direction of the surrounding text. In case of a conflict, they take on the embedding direction. End-of-text and start-of-text are treated as if there were a character of the embedding level at that position.

N1. A sequence of neutrals takes the direction of the surrounding strong text.

R N R => R R R
L N L => L L L

N2. Where there is a conflict in adjacent strong directions, a sequence of neutrals takes the embedding direction.

L N R => L e R
R N L => R e L

Since end-of-text (eot) and start-of-text (sot) are treated as if they were characters of the base embedding level, the following examples are covered by this rule:

L N eot => L e eot
R N eot => R e eot
sot N L => sot e L
sot N R => sot e R

N3. For the purpose of resolving neutrals,

(a) European numbers are treated as though they were the type of the previous strong character.
(b) If there is no previous strong character, European numbers are treated as though they had the base direction.
(c) Arabic numbers are treated as though they were R but do not affect the treatment of European numbers as in (a) and (b).

The following are examples:

R N EN N R => R R EN R R
R N EN N L => R R EN e L
L N EN N R => L L EN e R
L N EN N L => L L EN L L
R N AN N R => R R AN R R
R N AN N L => R R AN e L
L N AN N R => L e AN R R
L N AN N L => L e AN e L
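The neutral rules N1 to N3 can be sketched as follows. The sketch assumes a single run of text at one embedding level whose direction `e` is passed in as "L" or "R", and it uses that same direction for the base direction in N3(b) and for start-of-text and end-of-text. Any type other than L, R, EN or AN is treated as neutral. The function name is hypothetical.

```python
def resolve_neutrals(types, e):
    """Resolve neutral types to L or R; `e` is the embedding direction."""
    # Effective direction of each non-neutral character (rule N3).
    effective = []
    last_strong = e                 # N3(b): no previous strong character
    for t in types:
        if t in ("L", "R"):
            last_strong = t
            effective.append(t)
        elif t == "EN":
            effective.append(last_strong)   # N3(a)
        elif t == "AN":
            effective.append("R")           # N3(c)
        else:
            effective.append(None)          # neutral, resolved below

    out = list(types)
    n = len(types)
    i = 0
    while i < n:
        if effective[i] is not None:
            i += 1
            continue
        j = i
        while j < n and effective[j] is None:
            j += 1
        before = effective[i - 1] if i > 0 else e   # sot behaves like e
        after = effective[j] if j < n else e        # eot behaves like e
        direction = before if before == after else e  # N1, else N2
        for k in range(i, j):
            out[k] = direction
        i = j
    return out


print(resolve_neutrals(["L", "N", "AN", "N", "R"], "L"))
print(resolve_neutrals(["R", "N", "EN", "N", "L"], "L"))
```

With `e` set to "L", the two calls correspond to the example rows `L N AN N R => L e AN R R` and `R N EN N L => R R EN e L`.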

Examples. A list of numbers separated by neutrals and embedded in a directional run will come out in the run’s order.

Storage:	he said "THE VALUES ARE 123, 456, 789, OK".
Display:	he said "KO ,789 ,456 ,123 ERA SEULAV EHT".

In this case, both the comma and the space between the numbers take on the direction of the surrounding text (uppercase = right-to-left), ignoring the numbers. The commas are not considered part of the number since they are not surrounded on both sides (see number parsing). However, if there is an adjacent left-to-right sequence, then European numbers will adopt that direction:

Storage:	he said "IT IS A bmw 500, OK."
Display:	he said ".KO ,bmw 500 A SI TI"

Resolving Implicit Levels

In the final phase, the embedding level of text may be increased, based upon the resolved character type. Right-to-left text will always have an odd level, and left-to-right and numeric text will always have an even level. In addition, numeric text will always have a higher level than the base level, except in one special case. (Note that it is possible for text to end up at levels higher than 15 as a result of this process.) This results in the following rules:

I1. If the embedding level is even (left-to-right), then right-to-left text goes up one level. Numeric text (AN) goes up two levels. A sequence of one or more numeric types (EN) goes up two levels unless immediately preceded by left-to-right text.

I2. If the embedding level is odd (right-to-left), then left-to-right text and numeric text (EN or AN) go up one level.

Table 3-3 summarizes the results of the implicit algorithm. The "(L)" indicates a preceding character type.

Table 3-3. Resolving Implicit Levels

Embedding Level (EL)  Sequence Type  Result
Even                  L              EL
                      R              EL+1
                      AN             EL+2
                      EN             EL+2
                      (L) EN...EN    EL...EL
Odd                   R              EL
                      L              EL+1
                      AN             EL+1
                      EN             EL+1
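Rules I1 and I2 can be sketched as a single pass. The sketch assumes that weak and neutral types have already been resolved, so that only L, R, EN and AN remain, and that each character comes with its explicit embedding level. The function name is hypothetical.

```python
def implicit_levels(types, levels):
    """Return the final level of each character (rules I1 and I2)."""
    result = []
    prev_type = None
    en_run_stays = False   # True while an EN run directly follows L
    for t, el in zip(types, levels):
        if el % 2 == 0:                      # I1: even embedding level
            if t == "R":
                new = el + 1
            elif t == "AN":
                new = el + 2
            elif t == "EN":
                if prev_type != "EN":        # start of a new EN run
                    en_run_stays = prev_type == "L"
                new = el if en_run_stays else el + 2
            else:                            # L stays at EL
                new = el
        else:                                # I2: odd embedding level
            new = el + 1 if t in ("L", "EN", "AN") else el
        result.append(new)
        prev_type = t
    return result


print(implicit_levels(["L", "EN", "EN", "R", "EN", "AN"], [0] * 6))
print(implicit_levels(["R", "L", "EN", "AN"], [1] * 4))
```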

Reordering Resolved Levels

The following algorithm describes the logical process of finding the correct display order. As before, this logical process is not necessarily the actual implementation, which may diverge for efficiency. As opposed to resolution phases, this algorithm acts on a per-line basis.

The process of breaking a paragraph into one or more lines that fit within particular bounds is outside the scope of the bidirectional algorithm. Where character shaping is involved, it can be somewhat more complicated (see pages 6-22 through 6-32). Logically there are the following steps:

L1. Reset the embedding level of segment separators and block separators, any sequence of whitespace characters preceding a segment separator or block separator, and any sequence of whitespace characters at the end of the line to be the base embedding level.

Note: Since a Block Separator breaks lines, there will be at most one per line.

In combination with the following rule, this means that trailing whitespace will appear at the visual end of the line (in the base direction). Tabulation will always have a consistent direction within a directional block.

L2. From the highest level found in the text to the lowest odd level on each line, reverse any sequence of characters that are at that level or higher.

This reverses a progressively larger series of substrings. The following four examples illustrate this:

Memory:              car means CAR.
Resolved levels:     00000000001110
Reverse level 1:     car means RAC.

Memory:              car MEANS CAR.
Resolved levels:     22211111111111
Reverse level 2:     rac MEANS CAR.
Reverse levels 1-2:  .RAC SNAEM car

Memory:              he said "car MEANS CAR."
Resolved levels:     000000000222111111111100
Reverse level 2:     he said "rac MEANS CAR."
Reverse levels 1-2:  he said "RAC SNAEM car."

Memory:              DID YOU SAY ‘he said "car MEANS CAR"’?
Resolved levels:     11111111111112222222224443333333333211
Reverse level 4:     DID YOU SAY ‘he said "rac MEANS CAR"’?
Reverse levels 3-4:  DID YOU SAY ‘he said "RAC SNAEM car"’?
Reverse levels 2-4:  DID YOU SAY ‘"rac MEANS CAR" dias eh’?
Reverse levels 1-4:  ?‘he said "RAC SNAEM car"’ YAS UOY DID
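Rule L2 can be sketched as a loop from the highest level down to the lowest odd level, reversing each maximal run of characters at that level or higher. The sketch assumes one line of text with one resolved level per character, and it does not apply rule L1 or mirroring. The function name is hypothetical.

```python
def reorder_line(text, levels):
    """Return the display order of `text` given its resolved levels."""
    chars = list(text)
    lv = list(levels)
    odd_levels = [l for l in lv if l % 2 == 1]
    if not odd_levels:
        return text                      # nothing is right-to-left
    for level in range(max(lv), min(odd_levels) - 1, -1):
        i = 0
        while i < len(chars):
            if lv[i] >= level:
                j = i
                while j < len(chars) and lv[j] >= level:
                    j += 1
                # Reverse the run, keeping levels aligned with characters.
                chars[i:j] = chars[i:j][::-1]
                lv[i:j] = lv[i:j][::-1]
                i = j
            else:
                i += 1
    return "".join(chars)


memory = 'he said "car MEANS CAR."'
levels = [int(c) for c in "000000000222111111111100"]
print(reorder_line(memory, levels))
```

The call uses the third example above and prints its final display line.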

A character that possesses the mirrored property as specified by Section 4.7, Mirrored, should be depicted by a mirrored glyph if the resolved level of that character is odd. For example, U+0028 left parenthesis (which is interpreted in the Unicode Standard as an opening parenthesis) appears as "(" when its resolved level is even, and as the mirrored glyph ")" when its resolved level is odd.
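The choice of a mirrored glyph can be sketched as a lookup keyed on the parity of the resolved level. The table below is a small illustrative subset chosen for this sketch, not the list of mirrored characters, and substituting the paired character stands in for selecting a mirrored glyph in a real rendering engine. The names `MIRROR_PAIRS` and `display_char` are hypothetical.

```python
# Illustrative subset only; the full set is given by the mirrored property.
MIRROR_PAIRS = {
    "(": ")", ")": "(",
    "[": "]", "]": "[",
    "{": "}", "}": "{",
    "<": ">", ">": "<",
}


def display_char(ch, resolved_level):
    """Return the character whose glyph should be shown for `ch`."""
    if resolved_level % 2 == 1:          # odd level: right-to-left
        return MIRROR_PAIRS.get(ch, ch)
    return ch                            # even level: shown as-is


print(display_char("(", 0), display_char("(", 1))
```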

Combining marks applied to a right-to-left base character will at this point precede their base character. See Section 5.12 Rendering Non-Spacing Marks for an illustration of this. If the rendering engine expects them to follow the base characters in the final display process, then the ordering of the marks and the base character will need to be reversed.

Bidirectional Conformance

The bidirectional algorithm specifies part of the intrinsic semantics of right-to-left characters. In the absence of a higher-level protocol that specifically supersedes the interpretation of directionality, systems that interpret these characters must achieve results identical to the implicit bidirectional algorithm when rendering.

Explicit Formatting Codes

As with any Unicode characters, systems do not have to make use of any particular explicit directional formatting code (although it is not generally useful to include a terminating code without including the initiator). Generally, conforming systems will fall into three classes:

Higher-Level Protocols

The following are concrete examples of how systems may apply higher-level protocols to the ordering of bidirectional text.

When text using a higher-level protocol is to be converted to Unicode plain text, formatting codes should be inserted to ensure that the order matches that of the higher-level protocol, or (as in the last example) the appropriate characters can be substituted.

Vertical Text

In the case of vertical line orientation, the bidirectional algorithm is still used to determine the levels of the text. However, these levels are not used to reorder the text, since the characters are usually ordered uniformly from top to bottom. Instead, the levels are used to determine the rotation of the text. Sometimes vertical lines follow a vertical baseline in which each character is oriented as normal (with no rotation), with characters ordered from top to bottom whether they are Hebrew, numbers, or Latin. When setting text using the Arabic script in vertical lines, it is more common to employ a horizontal baseline that is rotated by 90° counterclockwise so that the characters are ordered from top to bottom. Latin text and numbers may be rotated 90° clockwise so that the characters are also ordered from top to bottom.

The bidirectional algorithm does come into effect when some characters are ordered from bottom to top. For example, this happens with a mixture of Arabic and Latin glyphs when all the glyphs are rotated uniformly 90° clockwise. (The choice of whether text is to be presented horizontally or vertically, or whether text is to be rotated, is not specified by the Unicode Standard, and is left up to higher-level protocols.)


Because of the implicit character types and the heuristics for resolving neutral and numeric directional behavior, the implicit bidirectional ordering will generally produce the correct display without any further work. However, problematic cases may occur when a right-to-left paragraph begins with left-to-right characters, or there are nested segments of different-direction text, or there are weak characters on directional boundaries. In these cases, embeddings or directional marks may be required to get the right display. Part numbers may also require directional overrides.

The most common problematic case is that of neutrals on the boundary of an embedded language. This can be addressed by setting the level of the embedded text correctly. For example, with all the text at level 0 the following occurs:

Memory:  he said "car MEANS CAR!", and expired.
Display: he said "car RAC SNAEM!", and expired.

If the exclamation mark is to be part of the Arabic quotation, then the user can select the text MEANS CAR! and explicitly mark it as embedded Arabic, which produces the following result:

Memory:  he said "car <RLE>MEANS CAR!<PDF>", and expired.
Display: he said "car !RAC SNAEM", and expired.

Another method of doing this is to place a right-to-left mark (RLM) after the exclamation mark. Since the exclamation mark is now not on a directional boundary, this produces the correct result.

Memory:  he said "car MEANS CAR!<RLM>", and expired.
Display: he said "car !RAC SNAEM", and expired.

This latter approach is preferred, since it does not make use of the stateful controls, which can easily get out of sync if not fully supported by editors and other string manipulation. The stateful controls are generally only needed for more complex (and rare) cases such as double embeddings, as in the example cited above:

Memory:  DID YOU SAY ‘he said "car MEANS CAR"’?
Display: ?‘he said "RAC SNAEM car"’ YAS UOY DID


Copyright © 1998-1998 Unicode, Inc. All Rights Reserved. The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential damages in connection with or arising out of the use of the information or programs contained in or accompanying this technical report.

Unicode and the Unicode logo are trademarks of Unicode, Inc., and are registered in some jurisdictions.

Unicode Home Page:

Unicode Technical Reports: