Collation in ICU
Vladimir Weinstein - IBM Corporation

Intended Audience: Software Engineers, Managers, Systems Analysts, Technical Writers, Testers, Web Designers

Session Level: Beginner, Intermediate, Advanced

Collation is the general term for the process of determining the sort order of character strings for a given language. It is a key function in computer systems: whenever a list of strings is presented to users, they are likely to want it sorted so that they can find individual strings easily and reliably. It is also crucial to the operation of databases, not only for sorting records but also for selecting sets of records whose fields fall within given bounds.

Getting collation to work correctly for many languages is difficult, and doing it with the speed that customers demand is harder still. Fortunately, the ICU (International Components for Unicode) library provides a high-performance, full-featured implementation of international collation, one that is used in IBM products and can be freely used in any other product. This presentation will review the capabilities of ICU collation and illustrate what can be done with it.
