Normalization Demo

This page contains a simple applet that demonstrates the differences among the normalization forms discussed in UTR #15: Unicode Normalization Forms, along with the source code used to produce that applet.

The demo uses a subset of the data in the Unicode Character Database (UCD) so that it loads faster in browsers. It supports the Latin-1 characters, plus a-circumflex-grave and a-circumflex-acute to show characters with multiple accents. Because many browsers won't actually display the combining accents, substitutes are used in display and editing (see Caveats).
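For readers without Java support in the browser, the behavior the applet illustrates can be approximated with a short standalone program. The sketch below is not the demo's own source. It assumes a Java runtime that provides the standard java.text.Normalizer class, and the class name NormalizationSketch and its helper methods are hypothetical names chosen for illustration. It prints the code points that each normalization form produces for a-circumflex-grave (U+1EA7) and for the Latin-1 character one-half (U+00BD).

```java
import java.text.Normalizer;

// Hypothetical illustration class; not part of the demo applet's source.
class NormalizationSketch {

    // Render a string as space-separated hexadecimal code points.
    static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    // Normalize the input to the given form and show the result as code points.
    static String normalizeToHex(String s, Normalizer.Form form) {
        return toHex(Normalizer.normalize(s, form));
    }

    public static void main(String[] args) {
        String[] samples = {
            "\u1EA7", // a-circumflex-grave, a character with multiple accents
            "\u00BD"  // vulgar fraction one half, a Latin-1 compatibility character
        };
        for (String sample : samples) {
            System.out.println("Input " + toHex(sample));
            // Form.values() yields NFD, NFC, NFKD, NFKC.
            for (Normalizer.Form form : Normalizer.Form.values()) {
                System.out.println("  " + form + ": " + normalizeToHex(sample, form));
            }
        }
    }
}
```

For a-circumflex-grave, the decomposed forms yield 0061 0302 0300 (the base letter followed by two combining accents), while the composed forms keep the single code point 1EA7. The one-half character is left alone by NFC and NFD, and changes only under the compatibility forms NFKC and NFKD.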

Operation

[This button needs Java support in the browser.]

Source

The following source files are used in the demo applet. In your browser, you may have to view them with View > Page Source to keep the browser from altering the line breaks and indentation.

Caveats