Re: Merging combining classes, was: New contribution N2676

From: Stefan Persson
Date: Sat Oct 25 2003 - 11:15:16 CST

Philippe Verdy wrote:

>The problem with this solution is that stability is not guaranteed across
>backward versions of Unicode: if a tool A implements the new version of
>combining classes and normalizes its input, it will keep the relative
>ordering of characters. If its output is injected into a tool B that still
>uses the legacy classes, the tool B may either reject the input (not
>normalized) or force the normalization. Then, if the text comes back to
>tool A, it will see a modified text.

Where this matters, wouldn't it be possible to specify a Unicode version,
and have the normaliser ignore any normalisation data that was only
introduced in versions later than the one specified? For example,

   x = normalise("some text", 4.0);

normalises the text according to the rules specified in Unicode 4.0, or,
if the software has not yet been updated with this information,
according to the rules in an earlier version of Unicode, while

   x = normalise("some text");

would normalise the text according to the most recent version of Unicode
for which the "normalise" program has any data.

