Re: Looking for transcription or transliteration standards latin->arabic

From: Peter Kirk
Date: Fri Jul 09 2004 - 10:42:27 CDT

    On 09/07/2004 15:40, Michael (michka) Kaplan wrote:

    >From: "Peter Kirk"
    >>But Kaplan is referring to something quite different, optionally
    >>ignoring diacritics in search operations. This is indeed desirable, so
    >>that a single search can match both Dvorak and Dvořák for example, and
    >>so that the one doing the search does not need to remember exactly which
    >>diacritics are used in the name. And it is already covered by the
    >>Unicode collation algorithm and default table, in which diacritics are
    >>distinguished only at the second level and so folded by a top level only
    >(a) If this were true and it were the only need, then case folding would
    >also just be "a UCA issue", yet case folding is in the document.

    I didn't say it was the only need, but it did seem to be the need you
    were highlighting, whereas Everson was highlighting a very different need.
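
As a rough illustration of the diacritic-insensitive search discussed above, here is a minimal Python sketch. It is not the Unicode Collation Algorithm: it only approximates a top-level-only comparison by decomposing to NFD, dropping combining marks, and case folding. The function names `fold_diacritics` and `loose_match` are hypothetical, chosen for this example. Letters that have no canonical decomposition (for example "ø" or "ł") are left untouched by this approach.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (assumed folding)."""
    # NFD splits a precomposed letter into a base letter plus combining marks,
    # e.g. U+0159 (r with caron) becomes "r" followed by U+030C.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep everything whose general category is not a mark (Mn, Mc, Me).
    stripped = "".join(
        ch for ch in decomposed if not unicodedata.category(ch).startswith("M")
    )
    return unicodedata.normalize("NFC", stripped)


def loose_match(query: str, candidate: str) -> bool:
    """Compare two strings while ignoring diacritics and case."""
    return fold_diacritics(query).casefold() == fold_diacritics(candidate).casefold()


if __name__ == "__main__":
    print(fold_diacritics("Dvořák"))
    print(loose_match("Dvorak", "Dvořák"))
```

The folding is applied only to the comparison keys, so the stored text keeps its diacritics.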

    And of course companies are free to use algorithms other than the UCA,
    but they shouldn't expect Unicode to define more than one way of doing
    the same thing - although to an extent there seems to be that kind of
    duplication between the UCA and the folding mechanism. I wonder if it
    would have been better to define the UCA explicitly as one or more
    foldings followed by a comparison operation, which might make it easier
    for implementers to combine Unicode standard foldings with their
    existing comparison mechanisms. But I don't wish to destabilise what is
    already defined.
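
The idea of "one or more foldings followed by a comparison operation" can be sketched in Python as below. This is an illustration under stated assumptions, not how the UCA is actually specified: the real algorithm compares multi-level weights taken from a collation table. The helper `make_key` and the two folding functions are hypothetical names; the "existing comparison mechanism" here is simply Python's built-in string ordering.

```python
import unicodedata


def fold_case(text: str) -> str:
    """Case folding, using Python's built-in str.casefold()."""
    return text.casefold()


def fold_diacritics(text: str) -> str:
    """Drop combining marks after NFD decomposition (an assumed folding)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed if not unicodedata.category(ch).startswith("M")
    )


def make_key(*foldings):
    """Build a key function that applies the given foldings in order."""
    def key(text: str) -> str:
        for folding in foldings:
            text = folding(text)
        return text
    return key


if __name__ == "__main__":
    # Foldings first, then the implementer's own comparison (plain string order).
    loose = make_key(fold_diacritics, fold_case)
    names = ["Zelenka", "Ádám", "dvořák"]
    print(sorted(names, key=loose))
```

Calling `make_key()` with no foldings yields an identity key, so the same comparison step can be reused with or without any particular folding.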

    >Does diacritic folding destroy information provided by the distinctions that
    >diacritics provide? Of course it does. But then again, the same can be said
    >of all foldings. This does not diminish their potential usefulness in
    >specific tasks/operations.

    Agreed. It's just that I don't agree that preparing texts for
    typesetting (at least within my European context) is one of those
    specific tasks/operations for diacritic folding.

    Peter Kirk
