RE: Canonical equivalence in rendering: mandatory or recommended?

From: Jill Ramonsky
Date: Wed Oct 15 2003 - 06:08:54 CST

> -----Original Message-----
> From: Peter Kirk
> Sent: Wednesday, October 15, 2003 12:19 PM
> To: Unicode List
> Subject: Canonical equivalence in rendering: mandatory or recommended?
> Does everyone agree that "This is not a performance issue"?

In my experience, there /is/ a performance hit.

I had to write an API for my employer last year to handle some aspects
of Unicode. We normalised everything to NFD, not NFC (but that's easier,
not harder). Nonetheless, all the string handling routines were not
allowed to /assume/ that the input was in NFD, but they had to guarantee
that the output was. These routines, therefore, had to do a "convert to
NFD" on every input, even if the input were already in NFD. This did
have a significant performance hit, since we were handling (Unicode)
strings throughout the app.
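
A minimal sketch of the "normalise every input" pattern described above.
The original API is not shown, so this uses Python's standard
unicodedata module as a stand-in, and the names to_nfd and concat are
hypothetical. It also illustrates why the output has to be normalised
again: joining two NFD strings does not always give an NFD string.

```python
import unicodedata


def to_nfd(s: str) -> str:
    """Convert to NFD unconditionally; the input is never trusted."""
    return unicodedata.normalize("NFD", s)


def concat(a: str, b: str) -> str:
    """Hypothetical string routine that guarantees NFD output."""
    # Every input is normalised, even if it happens to be NFD already.
    a = to_nfd(a)
    b = to_nfd(b)
    # The join can put combining marks out of canonical order,
    # so the result is normalised once more before it is returned.
    return to_nfd(a + b)
```

Each call pays for three normalisation passes, whether or not the
inputs needed any work.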

I think that next time I write a similar API, I will deal with
(string+bool) pairs, instead of plain strings, with the bool meaning
"already normalised". This would definitely speed things up. Of course,
for any strings coming in from "outside", I'd still have to assume they
were not normalised, just in case.
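
The (string+bool) idea might look roughly like this. It is only a
sketch in Python using the standard unicodedata module; the names Text,
from_outside, ensure_nfd and equivalent are all hypothetical and not
taken from any real API.

```python
import unicodedata
from typing import NamedTuple


class Text(NamedTuple):
    value: str
    is_nfd: bool  # True only when value is known to be in NFD


def from_outside(s: str) -> Text:
    """Wrap external input; it is assumed NOT to be normalised."""
    return Text(s, False)


def ensure_nfd(t: Text) -> Text:
    """Normalise only when the flag says it may be necessary."""
    if t.is_nfd:
        return t  # already normalised: no work done
    return Text(unicodedata.normalize("NFD", t.value), True)


def equivalent(a: Text, b: Text) -> bool:
    """Canonical-equivalence test on the NFD forms."""
    return ensure_nfd(a).value == ensure_nfd(b).value
```

A string that has been through ensure_nfd once is passed along as-is
afterwards, so the conversion cost is paid at the boundary only. The
flag has to be set honestly, of course: any routine that can break NFD
(a plain join, for instance) must clear it or renormalise.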

