[CLDR] Quotation marks data in CLDR inaccurate

From: Adam Twardoch (list.adam@twardoch.com)
Date: Fri Aug 15 2008 - 22:24:05 CDT

Next message: Jeroen Ruigrok van der Werven: "Re: [CLDR] Quotation marks data in CLDR inaccurate"

Previous message: Michael Everson: "Fun with Unicode spoofing"
Next in thread: Jeroen Ruigrok van der Werven: "Re: [CLDR] Quotation marks data in CLDR inaccurate"
Reply: Jeroen Ruigrok van der Werven: "Re: [CLDR] Quotation marks data in CLDR inaccurate"

The quotation marks data in CLDR:

http://www.unicode.org/cldr/data/charts/by_type/misc.delimiters.html

seems to be inaccurate for quite a few languages. I'd like to
conduct a survey among typographic experts worldwide to collect
"authoritative" and "definitive" information on quotation mark practices
in various locales, and then feed this data back to CLDR.

However, the presentation format on the page
http://www.unicode.org/cldr/data/charts/by_type/misc.delimiters.html
is not very useful for me. Can somebody please convert it to the
following format for me:

{localeCode} | {localeName} | {quotationStart}x{quotationEnd} |
{alternateQuotationStart}x{alternateQuotationEnd}

For example, for the two German locales listed there, the lines would
read:

de | German | „x“ | ‚x‘
de_CH | Swiss High German | «x» | ‹x›

One locale per line, please.
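
As a rough illustration of the requested format, here is a minimal Python
sketch that turns one locale's LDML data into one such line. It assumes the
delimiters sit under a <delimiters> element and the locale code can be
assembled from the <identity> element, as in the CLDR XML files. The function
name delimiters_line, the SAMPLE_LDML snippet, and the "?" placeholder for
missing (inherited) values are hypothetical choices made for this example,
and the locale name is passed in by hand because it is not stored in the
same place as the delimiters.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down LDML snippet for de_CH (illustration only).
SAMPLE_LDML = """<ldml>
  <identity>
    <language type="de"/>
    <territory type="CH"/>
  </identity>
  <delimiters>
    <quotationStart>«</quotationStart>
    <quotationEnd>»</quotationEnd>
    <alternateQuotationStart>‹</alternateQuotationStart>
    <alternateQuotationEnd>›</alternateQuotationEnd>
  </delimiters>
</ldml>"""


def delimiters_line(root, locale_name):
    """Format one locale as: code | name | startxend | altStartxaltEnd."""
    identity = root.find("identity")

    # Build the locale code from language plus optional subtags.
    parts = [identity.find("language").get("type")]
    for tag in ("script", "territory", "variant"):
        node = identity.find(tag)
        if node is not None:
            parts.append(node.get("type"))
    locale_code = "_".join(parts)

    def text(tag):
        # Assumption: "?" marks a value this file does not define itself.
        node = root.find("delimiters/" + tag)
        return node.text if node is not None and node.text else "?"

    return "{} | {} | {}x{} | {}x{}".format(
        locale_code,
        locale_name,
        text("quotationStart"),
        text("quotationEnd"),
        text("alternateQuotationStart"),
        text("alternateQuotationEnd"),
    )


print(delimiters_line(ET.fromstring(SAMPLE_LDML), "Swiss High German"))
```

For a file on disk, the root element could instead be obtained with
ET.parse(path).getroot(), and the resulting lines written out with
encoding="utf-8".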

Please send the output as a UTF-8-encoded plain text file to
adam@twardoch.com. I will take it from there and will prepare a survey,
making sure that it'll reach the appropriate experts in all the locales.
Then, I'll gather and edit the results, and submit them in appropriate
form to the Unicode Consortium.

BTW, does anyone have Python modules for parsing the CLDR data?

Regards,
Adam

-- 
Adam Twardoch
| Language Typography Unicode Fonts OpenType
| twardoch.com | silesian.com | fontlab.net
I hate to advocate drugs, alcohol, violence, or
insanity to anyone, but they've always worked for me.
(Hunter S. Thompson)


This archive was generated by hypermail 2.1.5 : Fri Aug 15 2008 - 22:28:18 CDT