Re: Merging combining classes, was: New contribution N2676

From: Philippe Verdy (
Date: Sat Oct 25 2003 - 10:11:49 CST

From: "Peter Kirk" <>
> I wonder if it would in fact be possible to merge certain adjacent
> combining classes, as from a future numbered version N of the standard.
> That would not affect the normalisation of existing text; text
> normalised before version N would remain normalised in version N and
> later, although not vice versa. I know that this would break the letter
> of the current stability policy, but is this kind of backward
> compatibility actually necessary? The change could be sold to others as
> required for the internal consistency of Unicode.

The problem with this solution is that stability is not guaranteed between
versions of Unicode: if a tool A implements the new, merged combining
classes and normalizes its input, it will keep the relative ordering of
characters whose classes were merged. If its output is then fed into a
tool B that still uses the legacy classes, tool B may either reject the
input (as not normalized) or force renormalization. Then, if the text comes
back to tool A, A will see a modified text.
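
To make the round trip concrete, here is a minimal sketch of canonical
reordering driven by a pluggable class table. The legacy values for HIRIQ
(14) and PATAH (17) are the current ones; the merged table, and the names
LEGACY_CCC, MERGED_CCC and canonical_reorder, are assumptions made up for
this illustration and not part of any proposal.

```python
# Legacy canonical combining classes for two Hebrew points.
LEGACY_CCC = {"\u05B4": 14, "\u05B7": 17}  # HIRIQ, PATAH
# Hypothetical merged table (assumption): both points share one class.
MERGED_CCC = {"\u05B4": 14, "\u05B7": 14}


def canonical_reorder(text, ccc):
    """Stable-sort each run of combining marks by combining class.

    Characters missing from the table are treated as starters (class 0).
    """
    out, run = [], []
    for ch in text:
        if ccc.get(ch, 0) == 0:
            # A starter ends the current run of combining marks.
            out.extend(sorted(run, key=lambda c: ccc[c]))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=lambda c: ccc[c]))
    return "".join(out)


original = "\u05DC\u05B7\u05B4"  # LAMED, PATAH, HIRIQ as typed

from_a = canonical_reorder(original, MERGED_CCC)   # tool A keeps the order
from_b = canonical_reorder(from_a, LEGACY_CCC)     # tool B swaps the points
back_at_a = canonical_reorder(from_b, MERGED_CCC)  # A cannot undo the swap

print(from_a == original)     # A left the text alone
print(from_b == from_a)       # B changed it
print(back_at_a == original)  # A receives a modified text
```

Because Python's sort is stable, marks of equal class keep their input
order, which is the behaviour the canonical ordering algorithm requires.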

One could argue that a CCO control may be generated when converting
for older versions of Unicode. But how will tool A know the version of
Unicode used by legacy tool B, if B is a remote service that does not
provide this version information to A?

The problem would then be the interoperability of Unicode-compliant
systems using distinct versions of Unicode (for example between
XML processors, text editors, input methods, renderers, text
converters, and full text search engines). This may even be critical for
tasks like sorting, in applications that require and expect that their
input is sorted according to its locale in a predictable way (for
example in applications using binary searches in sorted lists of
text items, such as authentication against a list of user names, or
a filename index).
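
The binary search case can be sketched as follows. The index holds a name
in the form a legacy normalizer would produce (HIRIQ before PATAH), while
the lookup key arrives in the order a tool with merged classes would leave
untouched (PATAH before HIRIQ). The helper name `find` and the sample
entries are invented for this illustration.

```python
import bisect

# Form stored in the index: LAMED, HIRIQ, PATAH (legacy canonical order).
stored = "\u05DC\u05B4\u05B7"
# Form sent by a client whose normalizer keeps the typed order:
# LAMED, PATAH, HIRIQ.
key = "\u05DC\u05B7\u05B4"

# A sorted list of user names, ordered by code point.
index = sorted(["alice", "bob", stored])


def find(sorted_names, name):
    """Exact-match binary search in a list sorted by code point."""
    i = bisect.bisect_left(sorted_names, name)
    return i < len(sorted_names) and sorted_names[i] == name


print(find(index, stored))  # the legacy form is found
print(find(index, key))     # the same name in the other order is missed
```

Both strings denote the same sequence of marks on the same base letter,
yet the lookup fails because the two sides disagree on the normalized form.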
