Re: CLDR errors that can't be corrected

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed May 17 2006 - 20:22:15 CDT


    After rereading the email I sent, I think I must apologize if the tone of my text was not respectful enough of your work. In fact, over the last few months most activity has been concentrated on the upcoming Unicode 5.0 version, on testing the beta, or on discussing it. Closing the CLDR at the same time as the Unicode 5.0 release is probably not the best choice.

    There was even a period when the Unicode websites were completely inaccessible (servers down at their colocation area).

    Also, the CLDR is still a child not completely born, with lots of beta data, frequent changes in the format, new aliases, and a CLDR website that does not even allow all cases to be tested completely.

    The site may also need ways to clean up our own errors or changes, instead of continuing to display options that we created ourselves and then chose not to support. Multiplying the options available on screen just complicates the validation of our own data. So why can't we simply remove the items that we created and no longer support?

    There should be a way to look through our submissions for items that we have still not reviewed, or for which there are conflicts of opinion with other users (maybe they are right, or maybe we can find an alternative compromise that satisfies multiple users).

    My opinion is that there's no urgency to close all languages simultaneously. And the vetting process, once started, should include a "Reject" option instead of just a "confirm" option. If things are completely frozen for the next version due to an absence of data in the open period, then there should be no radio button at all, but a link to the Unicode Report form, where only obvious bugs will be treated, either immediately or given consideration later.

    Alternatively, the submission forms could be left open, but new submissions would not be part of the next version, except where notable errors are reported and need to be corrected using the ongoing proposals.

    I think it's illusory to believe that all locales can be verified at the same time on the same schedule. So the vetting process should start only after some change proposals have been made since the last release and enough time has been given to correct things. It's a fact that the current process uses a very slow release cycle of several months, but the method used creates emergency hot spots in the schedule, which does not help the quality of submissions (notably when the CLDR structure has changed as much as it has in the last few months, with many corrections, a new aliasing model, new coherence checks, ...).

    For me the CLDR will be a useful tool in the long term, but it's really too soon to schedule it as if it had produced a coherent standard that must be maintained with stability rules like those of the Unicode standard and the UTC/WG2 working group schedules. For now there are still too many differences between various sources and related standards (including within the ISO standards themselves, such as the various orthographies of ISO 3166 countries, ISO 639 languages, ISO 15924 scripts, ISO 10646/Unicode character names and block names, the toponymy of time zones, normalization of singular/plural forms, uniform separators for alternate names...). Things are getting better, but they are not finished and not even stable.


