From: Adam Twardoch (list.adam@twardoch.com)
Date: Fri Aug 15 2008 - 22:24:05 CDT
The quotation mark data in CLDR:
http://www.unicode.org/cldr/data/charts/by_type/misc.delimiters.html
seems to be inaccurate for quite a few languages. I'd like to
conduct a survey among typographic experts worldwide to collect some
"authoritative" and "definitive" information on quotation mark practices
in various locales. I'd like to feed this data back to CLDR.
However, the presentation format on the page
http://www.unicode.org/cldr/data/charts/by_type/misc.delimiters.html
is not very useful for me. Could somebody please convert the data to the
following format:
{localeCode} | {localeName} | {quotationStart}x{quotationEnd} | {alternateQuotationStart}x{alternateQuotationEnd}
For example for the two German locales registered there, the lines would
say:
de | German | „x“ | ‚x‘
de_CH | Swiss High German | «x» | ‹x›
One locale per line, please.
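A minimal Python sketch of the requested line format may make it less ambiguous. The function names format_locale_line and write_survey_file are hypothetical, and the sketch assumes the six values for each locale have already been extracted from CLDR by some other means:

```python
def format_locale_line(code, name, q_start, q_end, alt_start, alt_end):
    """Build one survey line: code | name | primary quotes | alternate quotes.

    A literal "x" stands in for the quoted text, as in the examples above.
    """
    primary = f"{q_start}x{q_end}"
    alternate = f"{alt_start}x{alt_end}"
    return " | ".join([code, name, primary, alternate])


def write_survey_file(rows, path):
    """Write one locale per line to a UTF-8 encoded plain text file.

    `rows` is an iterable of 6-tuples in the argument order of
    format_locale_line (hypothetical helper, not part of CLDR).
    """
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for row in rows:
            handle.write(format_locale_line(*row) + "\n")
```

Calling format_locale_line with the German values shown above would reproduce the "de" example line.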
Please send the output as a UTF-8-encoded plain text file to
adam@twardoch.com. I will take it from there and will prepare a survey,
making sure that it'll reach the appropriate experts in all the locales.
Then, I'll gather and edit the results, and submit them in appropriate
form to the Unicode Consortium.
BTW, does anyone have Python modules for parsing the CLDR data?
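Lacking a dedicated module, a rough sketch using only the standard library's xml.etree.ElementTree might look like the following. It assumes the locale data is available as LDML XML with an identity element and a delimiters element; the embedded sample document and the function name read_delimiters are illustrative assumptions, not taken from an actual CLDR file:

```python
import xml.etree.ElementTree as ET

# Illustrative LDML fragment (an assumption about the file layout).
SAMPLE_LDML = """<ldml>
  <identity>
    <language type="de"/>
    <territory type="CH"/>
  </identity>
  <delimiters>
    <quotationStart>«</quotationStart>
    <quotationEnd>»</quotationEnd>
    <alternateQuotationStart>‹</alternateQuotationStart>
    <alternateQuotationEnd>›</alternateQuotationEnd>
  </delimiters>
</ldml>"""

FIELDS = (
    "quotationStart",
    "quotationEnd",
    "alternateQuotationStart",
    "alternateQuotationEnd",
)


def read_delimiters(xml_text):
    """Return (locale_code, marks) parsed from one LDML document.

    A mark that is absent from the document is reported as None; such a
    value would normally be inherited from a parent locale, which this
    sketch does not attempt to resolve.
    """
    root = ET.fromstring(xml_text)
    language = root.find("identity/language")
    territory = root.find("identity/territory")
    code = language.get("type") if language is not None else "root"
    if territory is not None:
        code += "_" + territory.get("type")
    marks = {name: root.findtext("delimiters/" + name) for name in FIELDS}
    return code, marks
```

The locale's display name is not read here, since it would have to be looked up separately.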
Regards,
Adam
-- 
Adam Twardoch
| Language Typography Unicode Fonts OpenType
| twardoch.com | silesian.com | fontlab.net

I hate to advocate drugs, alcohol, violence, or insanity to anyone,
but they've always worked for me. (Hunter S. Thompson)