Re: CLDR 1.3 Beta now available (fr collation)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Apr 22 2005 - 16:24:19 CST

    From: "Rick McGowan" <rick@unicode.org>
    > The 1.3 version of CLDR is now at beta status, and available via
    > <http://unicode.org/cldr/version/1.3.html>.

    I just downloaded it, and I wonder why, in the "fr" collation file, we only
    see these tailoring rules:
       <rules>
        <reset>ae</reset>
        <s>æ</s>
        <t>Æ</t>
    <!--
        <reset>A</reset>
        <x><s>Æ</s><extend>E</extend></x>
        <reset>a</reset>
        <x><s>æ</s><extend>e</extend></x>
    -->
       </rules>

    The rules for the French æ and Æ ligatures are correct: the ligatures have
    no primary difference from the unligated vowel pairs, only a secondary
    one, and there is a tertiary difference between the æ and Æ ligatures
    themselves.
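
    As a rough illustration of those three levels, here is a minimal Python
    sketch. The weights are made up for the example (they are not DUCET or
    CLDR values) and `sort_key` is a hypothetical helper, not a real collator
    API. It only assumes that a ligature expands to the primaries of its two
    vowels, carries a secondary mark, and that case is a tertiary difference.

```python
# Toy collation elements: (primary, secondary, tertiary) weights.
# All numbers are invented for illustration; they are NOT real UCA weights.
A, E = 1, 2  # hypothetical primary weights for "a" and "e"

ELEMENTS = {
    "a": [(A, 0, 0)],
    "A": [(A, 0, 1)],            # upper case: tertiary difference only
    "e": [(E, 0, 0)],
    "E": [(E, 0, 1)],
    # Ligatures expand to the primaries of a + e; secondary weight 1 marks
    # the ligature, tertiary weight 1 marks the upper-case ligature.
    "\u00e6": [(A, 1, 0), (E, 0, 0)],  # æ
    "\u00c6": [(A, 1, 1), (E, 0, 0)],  # Æ
}

def sort_key(text):
    """Build a three-level key: all primaries, then secondaries, then tertiaries."""
    ces = [ce for ch in text for ce in ELEMENTS[ch]]
    return tuple(tuple(ce[level] for ce in ces) for level in range(3))

words = ["\u00c6", "AE", "\u00e6", "ae"]
ordered = sorted(words, key=sort_key)
```

    With these toy weights the primary parts of the keys for "ae" and "æ" are
    identical, so the ligatures only separate from the vowel pairs at the
    secondary level, and æ from Æ at the tertiary level.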

    But why isn't there something similar for the much more common French
    ligatures œ and Œ? I would expect something like:

       <rules>
        <reset>ae</reset>
        <s>æ</s>
        <t>Æ</t>
        <reset>oe</reset>
        <s>œ</s>
        <t>Œ</t>
    <!--
        <reset>A</reset>
        <x><s>Æ</s><extend>E</extend></x>
        <reset>a</reset>
        <x><s>æ</s><extend>e</extend></x>
        <reset>O</reset>
        <x><s>Œ</s><extend>E</extend></x>
        <reset>o</reset>
        <x><s>œ</s><extend>e</extend></x>
    -->
       </rules>

    Aren't they missing? I can't see them in the "root" collation file. Or are
    the œ/Œ ligatures already handled in the Default Unicode Collation Element
    Table (unlike the æ and Æ ligatures, which I know sort as separate letters
    in other languages like Danish, and for which tailoring is justified here)?

    Also, isn't the commented-out form preferable? The uncommented form resets
    on a character pair, which a DFA-based collation engine will inevitably
    have to convert into the second form. Is UCA now recommending NFA-based
    collation engines? I think the deterministic form for French is small
    enough to be used instead of the non-deterministic form (note: every DFA
    is also an NFA; the reverse is of course false).
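
    As a rough sketch of that conversion, the following Python turns a rule
    whose reset is a character pair into a single-character reset plus an
    extension, mirroring the <x>...<extend> form shown in the comments. The
    function name and the dict layout are purely illustrative assumptions,
    not part of LDML or of any collator's API, and case-specific variants are
    not handled.

```python
def to_extend_form(reset, relation, target):
    """Rewrite a multi-character reset as first character + extension.

    Hypothetical helper: `relation` is the LDML element name of the
    difference level ("p", "s" or "t"); `target` is the tailored character.
    """
    head, tail = reset[:1], reset[1:]
    rule = {"reset": head, relation: target}
    if tail:
        # The remaining characters become the expansion of the target.
        rule["extend"] = tail
    return rule

ae_rule = to_extend_form("ae", "s", "\u00e6")   # æ after "ae", secondary
oe_rule = to_extend_form("oe", "s", "\u0153")   # œ after "oe", secondary
```

    Each resulting rule is keyed on a single reset character, so a table
    built from such rules can be consulted one input character at a time.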


