    On 2003.08.06, 11:37, Philippe Verdy <> wrote:

    > The main UCD table already contains the needed NFD canonical
    > decompositions, and removing accents is simply a matter of NFD
    > decomposition plus removal of combining characters
    > they are not really accents but are important to correctly identify
    > vowels and consonants,
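
For reference, the procedure described in the quote (NFD decomposition, then dropping the combining characters) can be sketched roughly as below. This is only an illustration in Python using the standard unicodedata module; the function name strip_marks is made up here, and treating every character in a "Mark" general category as removable is an assumption about what "combining characters" is meant to cover.

```python
import unicodedata


def strip_marks(text):
    """Illustrative only: NFD-decompose, then drop combining marks.

    'strip_marks' is a hypothetical helper name. Assumption: any character
    whose general category starts with "M" (Mn, Mc, Me) counts as a
    combining character to be removed.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )


# Portuguese "avó" (grandmother) and "avô" (grandfather) differ only in
# the diacritic, so both collapse to the same string once it is removed.
grandmother = strip_marks("av\u00f3")
grandfather = strip_marks("av\u00f4")
```

Both calls return the same unaccented string, which is exactly the kind of information loss at issue: the mechanics are simple, but the result is no longer the same word.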

    Note that even most Latin-script orthographies will suffer badly if
    diacriticals are removed. I'm sure we can all come up with examples,
    many of which are quite embarrassing or even dangerous. (E.g., the
    Portuguese for «Do you have a porpoise?» becomes quite nasty if you
    remove the one acute from it...) Learning that diacriticals do, in
    most languages, a lot more than just add snazziness to a word is
    probably lesson #1 in

