[CLDR] Quotation marks data in CLDR inaccurate

From: Adam Twardoch (list.adam@twardoch.com)
Date: Fri Aug 15 2008 - 22:24:05 CDT


    The quotation marks data in CLDR:

    http://www.unicode.org/cldr/data/charts/by_type/misc.delimiters.html

    appears to be inaccurate for quite a few languages. I'd like to
    conduct a survey among typographic experts worldwide to collect some
    "authoritative" and "definitive" information on quotation mark practices
    in various locales, and then feed this data back to CLDR.

    However, the presentation format on the page
    http://www.unicode.org/cldr/data/charts/by_type/misc.delimiters.html
    is not very convenient for this purpose. Could somebody please convert
    it to the following format:

    {localeCode} | {localeName} | {quotationStart}x{quotationEnd} |
    {alternateQuotationStart}x{alternateQuotationEnd}

    For example, for the two German locales listed there, the lines would
    read:

    de | German | „x“ | ‚x‘
    de_CH | Swiss High German | «x» | ‹x›

    One locale per line, please.
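
    As a rough illustration of the format above, here is a minimal Python
    sketch that builds one such line from an LDML fragment. It is only a
    sketch under assumptions: the sample XML is hand-written to mirror the
    <delimiters> element of a CLDR locale file, the function name
    format_delimiters is hypothetical, and the locale code and name are
    passed in by the caller rather than looked up. Real CLDR locales may
    inherit delimiters from a parent locale, which this sketch does not
    resolve; a missing value is printed as "?".

```python
import xml.etree.ElementTree as ET

# Hand-written sample mirroring the <delimiters> element of an LDML file
# (an assumption for illustration, not copied from the CLDR repository).
SAMPLE_DE = """<ldml>
  <identity><language type="de"/></identity>
  <delimiters>
    <quotationStart>„</quotationStart>
    <quotationEnd>“</quotationEnd>
    <alternateQuotationStart>‚</alternateQuotationStart>
    <alternateQuotationEnd>‘</alternateQuotationEnd>
  </delimiters>
</ldml>"""


def format_delimiters(ldml_text, locale_code, locale_name):
    """Return 'code | name | {start}x{end} | {altStart}x{altEnd}'.

    Hypothetical helper: locale_code and locale_name are supplied by the
    caller; no inheritance from parent locales is resolved.
    """
    root = ET.fromstring(ldml_text)

    def get(tag):
        # Look up a child of <delimiters>; fall back to "?" if absent.
        node = root.find("delimiters/" + tag)
        return node.text if node is not None and node.text else "?"

    main = get("quotationStart") + "x" + get("quotationEnd")
    alt = get("alternateQuotationStart") + "x" + get("alternateQuotationEnd")
    return " | ".join([locale_code, locale_name, main, alt])


if __name__ == "__main__":
    print(format_delimiters(SAMPLE_DE, "de", "German"))
```

    Running the function over each locale file and writing the results
    with encoding="utf-8" would give the one-locale-per-line text file
    described here.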

    Please send the output as a UTF-8-encoded plain text file to
    adam@twardoch.com. I will take it from there and prepare a survey,
    making sure that it reaches the appropriate experts in all the locales.
    Then I'll gather and edit the results, and submit them in an appropriate
    form to the Unicode Consortium.

    BTW, does anyone have Python modules for parsing the CLDR data?

    Regards,
    Adam

    -- 
    Adam Twardoch
    | Language Typography Unicode Fonts OpenType
    | twardoch.com | silesian.com | fontlab.net
    I hate to advocate drugs, alcohol, violence, or
    insanity to anyone, but they've always worked for me.
    (Hunter S. Thompson)
    

