Unicode Collation Stability Policy

This page lists the policies of the Unicode Consortium regarding stability of the Unicode Collation Algorithm.

This page was last updated 21-Oct-2011.

Unicode Collation Algorithm Stability

Each release of Unicode Technical Standard #10, Unicode Collation Algorithm, is a stable release and may be used as reference material or be cited as a normative reference by other specifications. Each version, once published, is absolutely stable and will never change.

Unicode DUCET Stability

The Default Unicode Collation Element Table (DUCET) forms a part of each release of the Unicode Collation Algorithm. Each version of DUCET, once released, is also absolutely stable and will never change.

Policies Regarding Change Between Versions

The DUCET table must be updated for each successive release of the Unicode Collation Algorithm to incorporate collation weights for the new repertoire in the corresponding version of the Unicode Standard. Because of the way the DUCET table is constructed and the relationships between collation weights, implementors should note that there are no formal guarantees of collation weight stability between versions. In particular:

  • Absolute weight values in the DUCET table change between versions, particularly for primary weight values. Implementations cannot depend on these absolute weights to remain identical between versions.
  • The changes in relative weights of characters are minimized between versions, but relative weights, too, can change between versions.
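
One practical consequence of the two points above is that sort keys persisted under one version of the collation data should not be compared with keys generated under another. The following is a minimal sketch in Python of one way an implementation might guard against that, by storing a version tag next to each key. The names StoredSortKey, needs_rebuild, and compare_keys are hypothetical and do not come from any Unicode specification or library, and the key bytes are treated as opaque values produced elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredSortKey:
    """A persisted sort key tagged with the collation data version that produced it.

    Hypothetical type for illustration only; not part of any Unicode API.
    """
    uca_version: str  # version label of the collation data used to build the key
    key: bytes        # opaque sort key bytes, generated by some collation library


def needs_rebuild(stored: StoredSortKey, current_version: str) -> bool:
    """Return True when the key was built under different collation data."""
    return stored.uca_version != current_version


def compare_keys(a: StoredSortKey, b: StoredSortKey) -> int:
    """Compare two stored keys bytewise, refusing to mix versions.

    Returns -1, 0, or 1. Raises ValueError if the keys were produced under
    different versions, because neither absolute nor relative weights are
    guaranteed to match across versions.
    """
    if a.uca_version != b.uca_version:
        raise ValueError(
            f"cannot compare keys from versions {a.uca_version} and {b.uca_version}"
        )
    return (a.key > b.key) - (a.key < b.key)
```

Under this approach, an index built from stored keys would be regenerated whenever needs_rebuild reports a version mismatch, rather than being merged with keys from the newer table.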

Because implementations of collation are best served by minimizing the amount of change between versions, the Unicode Technical Committee has established additional policies which it uses to ensure consistency and to minimize the churning of collation weights. The goal of these policies is to keep relative weights as stable as feasible, especially for characters that have been encoded for some time (2 years or longer).

The first set of policies constrains how existing entries in the DUCET table can be changed between versions. Those can be found in Change Management for the Unicode Collation Algorithm.

The second set of policies specifies criteria by which initial collation weights are assigned to characters newly added to the Unicode Standard. Those can be found in UCA Default Table Criteria for New Characters.