From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Nov 25 2003 - 13:21:44 EST
John Cowan writes:
> Since it adds efficiency to normalize only once,
> it is worthwhile to define a few normalization forms and urge
> people to produce text in one of them, so that receivers need not
> normalize but need only check for normalization, typically much cheaper.
I'm not convinced that there's a significant improvement in only checking
for normalization without performing it. The check requires at least a list
of the characters that are acceptable in a normalization form, as well as
their combining classes.
This data, which still requires a table to perform the check, is not much
smaller than the data with the needed decompositions. Moreover, if one can
perform a normalization check and detect that combining characters can be
reordered, it's not a big performance hit to reorder them, even if we must
decompose them first. In any case, you still need to look up
characters in a table of character properties.
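
To illustrate the point about combining classes, here is a rough Python
sketch (not taken from any particular implementation). It assumes the
standard unicodedata module as the property table; the function names
is_canonically_ordered and canonical_reorder are made up for this example.
It only covers the ordering step, not decomposition or composition. Both
functions do the same per-character combining-class lookup, which is the
shared cost I mean.

```python
import unicodedata


def is_canonically_ordered(text):
    """Check only: combining classes must not decrease within a run of marks."""
    last_cc = 0
    for ch in text:
        cc = unicodedata.combining(ch)  # table lookup per character
        if cc != 0 and cc < last_cc:
            return False
        last_cc = cc
    return True


def canonical_reorder(text):
    """Fix: stable-sort each run of combining marks by combining class."""
    out = []
    run = []
    for ch in text:
        if unicodedata.combining(ch) == 0:  # same lookup as in the check
            out.extend(sorted(run, key=unicodedata.combining))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=unicodedata.combining))
    return "".join(out)


# "a" + COMBINING ACUTE ACCENT (class 230) + COMBINING DOT BELOW (class 220)
sample = "a\u0301\u0323"
ordered_before = is_canonically_ordered(sample)
reordered = canonical_reorder(sample)
```

For this already decomposed sample, the reordered string should match what
unicodedata.normalize("NFD", sample) returns.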
The real performance gain comes when applications do not even need to
perform this check, because every string is marked with the normalization
forms it is currently known to satisfy or not to satisfy.
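
One possible shape for such marking, as a minimal Python sketch: the
TaggedText class, its known_forms field and its ensure method are all
hypothetical names invented here, and the sketch only records forms the
string is known to be in (tracking forms it is known NOT to be in would
need a second set). It relies on the standard unicodedata.normalize call.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedText:
    """A string carried together with the forms it is known to be in."""
    text: str
    known_forms: frozenset = frozenset()

    def ensure(self, form):
        # Fast path: the tag answers the question, no table lookup at all.
        if form in self.known_forms:
            return self
        normalized = unicodedata.normalize(form, self.text)
        if normalized == self.text:
            # Text unchanged, so earlier tags remain valid.
            return TaggedText(self.text, self.known_forms | {form})
        # Text changed, so only the form just produced is known to hold.
        return TaggedText(normalized, frozenset({form}))


raw = TaggedText("e\u0301")    # "e" + COMBINING ACUTE ACCENT, untagged
nfc = raw.ensure("NFC")        # normalizes once and records the tag
again = nfc.ensure("NFC")      # returns the same object, no work done
```

The receiver only pays for normalization when the tag is missing, which is
the case the sender could have avoided by tagging the text in the first
place.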
This archive was generated by hypermail 2.1.5 : Tue Nov 25 2003 - 14:08:30 EST