From aaron@ijigg.com Tue Mar 4 16:26:58 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Tue, 04 Mar 2008 19:51:34 -0600 (CST) Received: from hu-out-0506.google.com (hu-out-0506.google.com [72.14.214.226]) by unicode.org (8.12.11/8.12.11) with ESMTP id m24MQqBV023405 for ; Tue, 4 Mar 2008 16:26:58 -0600 Received: by hu-out-0506.google.com with SMTP id 19so682110hue.21 for ; Tue, 04 Mar 2008 14:26:52 -0800 (PST) Received: by 10.86.97.7 with SMTP id u7mr1921049fgb.39.1204669612414; Tue, 04 Mar 2008 14:26:52 -0800 (PST) Received: by 10.86.53.15 with HTTP; Tue, 4 Mar 2008 14:26:52 -0800 (PST) Message-ID: <756ec90c0803041426g7932103frc93938d1b11373ec@mail.gmail.com> Date: Tue, 4 Mar 2008 14:26:52 -0800 From: "Aaron Brick" To: cldr-users@unicode.org Subject: 'XXX' in Cyrillic-Latin MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_16014_22517651.1204669612397" X-archive-position: 398 X-Approved-By: rick@unicode.org X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: aaron@ijigg.com Precedence: bulk X-list: cldr-users
------=_Part_16014_22517651.1204669612397 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline

am I correct that the "XXX" entries in Cyrillic-Latin.xml are the ones referred to by "TODO: add remaining characters"? i didn't find a bug on this topic but it must be known to other users of minor CIS languages. i noticed this when transliterating the Tajik "Ҷорҷ Уокер Буш" yielded "Ҷorҷ Uoker Buš" instead of the more correct "Jorj Uoker Buš". it would be easy enough to add new equivalences but i am not a slavist and my contributions would not hold up in court.

thanks for your comments,
aaron.
------=_Part_16014_22517651.1204669612397--
From rick@unicode.org Wed Mar 5 10:53:56 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Wed, 05 Mar 2008 11:00:20 -0600 (CST) Received: from izanami (c-71-202-247-55.hsd1.ca.comcast.net [71.202.247.55]) by unicode.org (8.12.11/8.12.11) with SMTP id m25GrobT012083; Wed, 5 Mar 2008 10:53:50 -0600 Message-Id: <200803051653.m25GrobT012083@unicode.org> To: unicode@unicode.org Subject: Update to UAX #29 now available Date: Wed, 5 Mar 2008 08:53:51 -0800 From: Rick McGowan received: by Apple.Mailer (2.95.2) X-archive-position: 399 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: rick@unicode.org Precedence: bulk X-list: cldr-users

There is a new draft of the Proposed Update to Unicode Standard Annex #29 Unicode Text Segmentation, reflecting changes authorized in the last UTC meeting:

* Sentence Segmentation. Revised the contents of SContinue, characters that 'continue' a sentence.
* Word Segmentation. Added Newline, and rules WB3a and WB3b to break words within other newline sequences
* Grapheme Cluster Segmentation.
  * Added Prepend and rule GB9b to handle Thai and Lao.
  * Major revision of Section 3 Grapheme Cluster Boundaries. Includes change of name to extended grapheme cluster, clearer distinction from legacy grapheme clusters, and significant reordering and enhancement of the text
  * Note that the GraphemeBreakTest file in the UCD now tests the extended grapheme clusters, since it is the recommended choice.

The UAX document is at http://www.unicode.org/reports/tr29/tr29-12.html.

The data files are in http://www.unicode.org/Public/5.1.0/ucd/auxiliary/.

The HTML charts are at:
http://www.unicode.org/Public/5.1.0/ucd/auxiliary/GraphemeBreakTest-5.1.0d28.html
http://www.unicode.org/Public/5.1.0/ucd/auxiliary/WordBreakTest-5.1.0d26.html
http://www.unicode.org/Public/5.1.0/ucd/auxiliary/LineBreakTest-5.1.0d30.html
http://www.unicode.org/Public/5.1.0/ucd/auxiliary/SentenceBreakTest-5.1.0d26.html
(The d numbers may be updated over the next month, so if these links don't work, go first to the directory.)

Unicode 5.1.0 is currently in the pre-publication phase and is due for release at the end of March 2008. No more substantive changes are planned, beyond those already approved by the Unicode Technical Committee. However, if you have editorial comments on the text of Unicode 5.1.0, including this document, please report via the online reporting form (http://www.unicode.org/reporting.html).

From eik@iki.fi Sun Mar 9 09:20:50 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Sun, 09 Mar 2008 09:20:53 -0600 (CST) Received: from smtp5.pp.htv.fi (smtp5.pp.htv.fi [213.243.153.39]) by unicode.org (8.12.11/8.12.11) with ESMTP id m29FKiQP008071; Sun, 9 Mar 2008 09:20:50 -0600 Received: from inspiron (cs181253188.pp.htv.fi [82.181.253.188]) by smtp5.pp.htv.fi (Postfix) with ESMTP id 6956D5BC04C; Sun, 9 Mar 2008 17:20:43 +0200 (EET) Reply-To: From: "Erkki I. 
Kolehmainen" To: , , "'CLDR list'" , Cc: , "'Hillevi Vuori'" , "'Le Gall Gaid'" , Subject: Invitation to Participate in a Workshop on Functional Multilingual Extensions to Keyboard Layouts Date: Sun, 9 Mar 2008 17:20:38 +0200 Message-ID: <000001c881f9$216c61d0$0200a8c0@inspiron> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_NextPart_000_0001_01C88209.E4F7A2D0" X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.6822 Importance: Normal Thread-Index: AciB+SEFxJTeC22vRuKyHpJfJi1p3Q== X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3198 X-archive-position: 400 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: eik@iki.fi Precedence: bulk X-list: cldr-users

This is a multi-part message in MIME format.
------=_NextPart_000_0001_01C88209.E4F7A2D0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit

Dear All - with apologies to those who receive this in multiple copies,

CEN, the European standards organization, has set up a Workshop on Functional Multilingual Extensions to European Keyboard Layouts (CEN/ISSS WS/MEEK). In spite of its name, the Workshop is not necessarily limited to European aspects (although primarily to the Latin script), depending on the participants. As per its business plan, participation in the Workshop is open to all interested parties, whether European or not.

The kick-off meeting was held in Brussels on 25 January 2008.

The secretariat of the WS is held by CSC - The Finnish IT Centre for Science - on behalf of SFS - The Finnish Standards Association. The officers of the WS can be reached at: meek AT postit dot csc dot fi. The home site for the WS is at http://www.csc.fi/meek.

The CEN site for the WS is at:
http://www.cen.eu/cenorm/businessdomains/businessdomains/isss/activity/ws+meek.asp.

The CEN site holds the approved Business Plan and a call for a Project Team of 3 to 5 voluntary experts. Applications for the Project Team are to be submitted to CEN by 14 March 2008. Those wishing to participate in the Workshop should contact the secretary, Mr. Tero Aalto at CSC. The actual operation of the Workshop will commence after the Project Team has been set up, which will thus be somewhat later than envisaged in the Business Plan.

I'd hope for an active, representative participation, which is the only way to achieve a quality result.

As per its business plan, the Workshop will prepare a single-part CWA (CEN Workshop Agreement) covering at least the following areas:

* a list of aspects to be taken into consideration when designing regional multilingual keyboards and related input methods (technical, legal, and usability-related);
* specific recommendation and guidance related to the above aspects;
* considerations and possibly guidance for publicly available terminals, e.g. in Internet cafes, for travellers, etc.

The Workshop will not define any national or language-specific keyboard layouts. Some such layouts may be used as examples.

The CWA is expected to be published in early 2009.

Sincerely, Erkki

Erkki I. Kolehmainen, Chair, CEN/ISSS WS/MEEK
Tilkankatu 12 A 3, FI-00300 Helsinki, Finland
Tel. +358 9 4368 2643, Mob. +358 400 825 943

------=_NextPart_000_0001_01C88209.E4F7A2D0-- From naz@mira.net Wed Mar 12 04:45:34 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Wed, 12 Mar 2008 04:45:34 -0600 (CST) Received: from smtp-auth.no-ip.com (smtp-auth.no-ip.com [204.16.252.95]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2CAjXnH031953 for ; Wed, 12 Mar 2008 04:45:34 -0600 X-No-IP: mrnaz.com@noip-smtp X-Report-Spam-To: abuse@no-ip.com Received: from [58.111.62.13] (pa58-111-62-13.pa.vic.optusnet.com.au [58.111.62.13]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) (Authenticated sender: mrnaz.com@noip-smtp) by smtp-auth.no-ip.com (Postfix) with ESMTP id 5FCB1BF50 for ; Wed, 12 Mar 2008 03:45:32 -0700 (PDT) Message-ID: <47D7B446.6010303@mira.net> Date: Wed, 12 Mar 2008 21:45:26 +1100 From: Naz Gassiep User-Agent: Thunderbird 2.0.0.12 (Windows/20080213) MIME-Version: 1.0 To: cldr-users@unicode.org Subject: DST indicator Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 401 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: naz@mira.net Precedence: bulk X-list: cldr-users This question is about the indicator for when a time or zone is currently in DST. In English I'd just indicate that using something like (DST) after the time or the timezone name. Is there a translation item for this in the CLDR? Thanks, - Naz. 
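One locale-neutral alternative to a translated "(DST)" marker is to show the numeric UTC offset, optionally with the zone abbreviation that already encodes summer time. The following is only a minimal Python sketch under stated assumptions: the fixed-offset zones stand in for a real time zone database lookup, and the `describe` helper is a hypothetical name, not part of CLDR or any library.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for a real tz database lookup (illustrative only).
CET = timezone(timedelta(hours=1), "CET")    # Central European Time (standard time)
CEST = timezone(timedelta(hours=2), "CEST")  # Central European Summer Time (DST)


def describe(moment: datetime) -> str:
    """Return the ISO 8601 form with numeric offset, plus the zone abbreviation."""
    return f"{moment.isoformat()} ({moment.tzname()})"


winter = datetime(2008, 3, 12, 12, 10, 42, tzinfo=CET)
summer = datetime(2008, 7, 12, 12, 10, 42, tzinfo=CEST)

print(describe(winter))  # 2008-03-12T12:10:42+01:00 (CET)
print(describe(summer))  # 2008-07-12T12:10:42+02:00 (CEST)
```

The numeric offset needs no translation, while the abbreviation in parentheses is the part that would have to come from locale data.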
From asmodai@in-nomine.org Wed Mar 12 05:10:45 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Wed, 12 Mar 2008 05:10:45 -0600 (CST) Received: from nexus.in-nomine.org (dhammapada.xs4all.nl [82.95.168.248]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2CBAiIY009370 for ; Wed, 12 Mar 2008 05:10:45 -0600 Received: from localhost (localhost.domini.in-nomine.org [127.0.0.1]) by nexus.in-nomine.org (Postfix) with ESMTP id E99C9C12E; Wed, 12 Mar 2008 12:10:43 +0100 (CET) X-Virus-Scanned: by amavisd-new using ClamAV at in-nomine.org Received: from nexus.in-nomine.org ([127.0.0.1]) by localhost (nexus.domini.in-nomine.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id hTUT1tqXBP64; Wed, 12 Mar 2008 12:10:43 +0100 (CET) Received: by nexus.in-nomine.org (Postfix, from userid 1000) id 0867EC11E; Wed, 12 Mar 2008 12:10:43 +0100 (CET) Date: Wed, 12 Mar 2008 12:10:42 +0100 From: Jeroen Ruigrok van der Werven To: Naz Gassiep Cc: cldr-users@unicode.org Subject: Re: DST indicator Message-ID: <20080312111042.GM60713@nexus.in-nomine.org> References: <47D7B446.6010303@mira.net> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <47D7B446.6010303@mira.net> Organisation: Ninth Circle Enterprises User-Agent: Mutt/1.5.17 (2007-11-01) X-archive-position: 402 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: asmodai@in-nomine.org Precedence: bulk X-list: cldr-users -On [20080312 11:51], Naz Gassiep (naz@mira.net) wrote: >This question is about the indicator for when a time or zone is currently >in DST. In English I'd just indicate that using something like (DST) after >the time or the timezone name. Is there a translation item for this in the >CLDR? DST is not a standard term for referring to the specific timezone under daylight-savings. For example, West Europe uses CET and CEST and WET/WEST. 
For this kind of thing you can find entries in the CLDR. -- Jeroen Ruigrok van der Werven / asmodai イェルーン ラウフロック ヴァン デル ウェルヴェン http://www.in-nomine.org/ | http://www.rangaku.org/ Better to understand a little than to misunderstand a lot...
From verdy_p@wanadoo.fr Wed Mar 12 05:27:43 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Wed, 12 Mar 2008 05:27:43 -0600 (CST) Received: from smtp23.orange.fr (smtp23.orange.fr [193.252.22.30]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2CBRghL014133 for ; Wed, 12 Mar 2008 05:27:43 -0600 Received: from me-wanadoo.net (localhost [127.0.0.1]) by mwinf2338.orange.fr (SMTP Server) with ESMTP id B94A91C00091; Wed, 12 Mar 2008 12:27:36 +0100 (CET) Received: from HARNON (APoitiers-258-1-156-109.w90-5.abo.wanadoo.fr [90.5.123.109]) by mwinf2338.orange.fr (SMTP Server) with ESMTP id 60B031C0008E; Wed, 12 Mar 2008 12:27:36 +0100 (CET) X-ME-UUID: 20080312112736396.60B031C0008E@mwinf2338.orange.fr Reply-To: From: "Philippe Verdy" To: "'Naz Gassiep'" , References: <47D7B446.6010303@mira.net> Subject: RE: DST indicator Date: Wed, 12 Mar 2008 12:27:28 +0100 Organization: Ordinateur Personnel Message-ID: <03b601c88434$0e13e5d0$0a01a8c0@HARNON> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" X-Mailer: Microsoft Office Outlook 11 In-Reply-To: <47D7B446.6010303@mira.net> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3198 Thread-Index: AciEMPyGQQvjTCspShudIYP7bMqmnwAAd9AA Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by unicode.org id m2CBRghL014133 X-archive-position: 403 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: verdy_p@wanadoo.fr Precedence: bulk X-list: cldr-users

When it is written, it is most often indicated within the abbreviated timezone. E.g., see "CDT" vs. "CST" (used in America), or "CEST" vs. "CET" (used in Europe). Look into CLDR data about timezone names and abbreviations.

Timezone names and abbreviations are highly contextual, and most often not translatable. For this reason, numeric formats are preferable, and the numeric timezone indicator should not require any DST adjustment. Using ISO 8601 time formats should really help make your dates and times internationally recognized correctly.

Other formats should preferably be used only in the local user's locale, according to the user's preferences (if the system allows users to set this preference); otherwise the system should make it clear to all users which timezone it accepts and uses by displaying an explicit timezone indicator in numeric format only.

> -----Original Message-----
> From: cldr-users-bounce@unicode.org [mailto:cldr-users-bounce@unicode.org] On Behalf Of Naz Gassiep
> Sent: Wednesday, 12 March 2008 11:45
> To: cldr-users@unicode.org
> Subject: DST indicator
>
> This question is about the indicator for when a time or zone is currently in DST. In English I'd just indicate that using something like (DST) after the time or the timezone name. Is there a translation item for this in the CLDR?
> Thanks, From naz@mira.net Sun Mar 16 04:24:30 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Sun, 16 Mar 2008 04:24:30 -0600 (CST) Received: from smtp-auth.no-ip.com (smtp-auth.no-ip.com [204.16.252.95]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2GAOTP7012275 for ; Sun, 16 Mar 2008 04:24:30 -0600 X-No-IP: mrnaz.com@noip-smtp X-Report-Spam-To: abuse@no-ip.com Received: from [192.168.0.21] (ppp59-167-71-92.lns1.mel6.internode.on.net [59.167.71.92]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) (Authenticated sender: mrnaz.com@noip-smtp) by smtp-auth.no-ip.com (Postfix) with ESMTP id C37EBBE93 for ; Sun, 16 Mar 2008 03:24:23 -0700 (PDT) Message-ID: <47DCF554.9090607@mira.net> Date: Sun, 16 Mar 2008 21:24:20 +1100 From: Naz Gassiep User-Agent: Thunderbird 2.0.0.12 (Windows/20080213) MIME-Version: 1.0 To: cldr-users@unicode.org Subject: Translations of subcountries. Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 404 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: naz@mira.net Precedence: bulk X-list: cldr-users Are there any plans to include translations of subcountries (states and provinces) in the CLDR? Or is this just too big a job? How about city names? - Naz. 
From asmodai@in-nomine.org Sun Mar 16 04:52:03 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Sun, 16 Mar 2008 04:52:04 -0600 (CST) Received: from nexus.in-nomine.org (dhammapada.xs4all.nl [82.95.168.248]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2GAq2j0025081 for ; Sun, 16 Mar 2008 04:52:03 -0600 Received: from localhost (localhost.domini.in-nomine.org [127.0.0.1]) by nexus.in-nomine.org (Postfix) with ESMTP id 67DBFC12E; Sun, 16 Mar 2008 11:52:01 +0100 (CET) X-Virus-Scanned: by amavisd-new using ClamAV at in-nomine.org Received: from nexus.in-nomine.org ([127.0.0.1]) by localhost (nexus.domini.in-nomine.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 4A4KmBTWxrD9; Sun, 16 Mar 2008 11:52:00 +0100 (CET) Received: by nexus.in-nomine.org (Postfix, from userid 1000) id 81123C11E; Sun, 16 Mar 2008 11:52:00 +0100 (CET) Date: Sun, 16 Mar 2008 11:52:00 +0100 From: Jeroen Ruigrok van der Werven To: Naz Gassiep Cc: cldr-users@unicode.org Subject: Re: Translations of subcountries. Message-ID: <20080316105200.GI60713@nexus.in-nomine.org> References: <47DCF554.9090607@mira.net> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <47DCF554.9090607@mira.net> Organisation: Ninth Circle Enterprises User-Agent: Mutt/1.5.17 (2007-11-01) X-archive-position: 405 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: asmodai@in-nomine.org Precedence: bulk X-list: cldr-users -On [20080316 11:37], Naz Gassiep (naz@mira.net) wrote: >Are there any plans to include translations of subcountries (states and >provinces) in the CLDR? Or is this just too big a job? How about city >names? 
I cannot speak for the CLDR project, but given that a small country like mine already has 12 provinces and knowing the amount of prefectures Japan has, I think adding such information to the CLDR and have it translated into all the present languages is going to bloat it enormously. Let alone city names. -- Jeroen Ruigrok van der Werven / asmodai イェルーン ラウフロック ヴァン デル ウェルヴェン http://www.in-nomine.org/ | http://www.rangaku.org/ Stand before it - there is no beginning. Follow it and there is no end. Stay with the Tao, move with the present... From verdy_p@wanadoo.fr Sun Mar 16 06:21:07 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Sun, 16 Mar 2008 06:21:08 -0600 (CST) Received: from smtp2a.orange.fr (smtp2a.orange.fr [80.12.242.140]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2GCL7nN023882 for ; Sun, 16 Mar 2008 06:21:07 -0600 Received: from me-wanadoo.net (localhost [127.0.0.1]) by mwinf2a16.orange.fr (SMTP Server) with ESMTP id A168B7000126; Sun, 16 Mar 2008 13:21:01 +0100 (CET) Received: from HARNON (APoitiers-258-1-137-175.w90-50.abo.wanadoo.fr [90.50.112.175]) by mwinf2a16.orange.fr (SMTP Server) with ESMTP id 0F78D7000125; Sun, 16 Mar 2008 13:21:01 +0100 (CET) X-ME-UUID: 20080316122101634.0F78D7000125@mwinf2a16.orange.fr Reply-To: From: "Philippe Verdy" To: "'Jeroen Ruigrok van der Werven'" , "'Naz Gassiep'" Cc: References: <47DCF554.9090607@mira.net> <20080316105200.GI60713@nexus.in-nomine.org> Subject: RE: Translations of subcountries. 
Date: Sun, 16 Mar 2008 13:20:56 +0100 Organization: Ordinateur Personnel Message-ID: <001101c88760$30087550$0a01a8c0@HARNON> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" X-Mailer: Microsoft Office Outlook 11 In-Reply-To: <20080316105200.GI60713@nexus.in-nomine.org> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3198 Thread-Index: AciHVvUnDb8/xhWIS0GYtk307Z26ewAB0wTw Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by unicode.org id m2GCL7nN023882 X-archive-position: 406 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: verdy_p@wanadoo.fr Precedence: bulk X-list: cldr-users

Jeroen Ruigrok van der Werven wrote:
> -On [20080316 11:37], Naz Gassiep (naz@mira.net) wrote:
> >Are there any plans to include translations of subcountries (states and
> >provinces) in the CLDR? Or is this just too big a job? How about city
> >names?
>
> I cannot speak for the CLDR project, but given that a small country like mine already has 12 provinces and knowing the amount of prefectures Japan has, I think adding such information to the CLDR and have it translated into all the present languages is going to bloat it enormously. Let alone city names.

At least this could be done for the few federal countries, where each state has its own local constitution. This does not concern many countries, but it already yields an impressive list of states (or provinces, Länder, cantons, or republics): USA, Canada, Mexico, Brazil, Germany, Switzerland, Russia, India, Australia.

But then one could argue for adding other subdivisions on the grounds that they also have significant autonomy, as in Spain or the UK, and then there is no clear criterion for deciding which countries to subdivide.

The only relevant list would then be the one found in ISO 3166-2, which is still not finished but is available at least in English, French, and a local language (possibly with several transcriptions, or with a romanization for the English and French publication that does not always reflect the names used in French and English but just transliterates the name used in the local language).

Then agreement on the orthography to use will be extremely difficult to reach, with no definitive source, differences between sources or over time, and inconsistencies even within the same documents from the same sources and authors...

From patrick.andries@xcential.com Sun Mar 16 11:21:35 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Sun, 16 Mar 2008 11:21:36 -0600 (CST) Received: from skywalker.myinternetwebhost.com (skywalker.myinternetwebhost.com [69.90.236.45]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2GHLYvu022549 for ; Sun, 16 Mar 2008 11:21:35 -0600 Received: from dsl-205-205-142-75.cooptel.qc.ca [205.205.142.75] by skywalker.myinternetwebhost.com with SMTP; Sun, 16 Mar 2008 10:24:33 -0700 Message-ID: <47DD56D8.9020000@xcential.com> Date: Sun, 16 Mar 2008 13:20:24 -0400 From: Patrick Andries User-Agent: Thunderbird 2.0.0.12 (Windows/20080213) MIME-Version: 1.0 To: Jeroen Ruigrok van der Werven CC: Naz Gassiep , cldr-users@unicode.org Subject: Re: Translations of subcountries. 
References: <47DCF554.9090607@mira.net> <20080316105200.GI60713@nexus.in-nomine.org> In-Reply-To: <20080316105200.GI60713@nexus.in-nomine.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 407 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: patrick.andries@xcential.com Precedence: bulk X-list: cldr-users

Jeroen Ruigrok van der Werven wrote:
> -On [20080316 11:37], Naz Gassiep (naz@mira.net) wrote:
>
>> Are there any plans to include translations of subcountries (states and
>> provinces) in the CLDR? Or is this just too big a job? How about city
>> names?
>
> I cannot speak for the CLDR project, but given that a small country like
> mine already has 12 provinces and knowing the amount of prefectures Japan
> has, I think adding such information to the CLDR and have it translated into
> all the present languages is going to bloat it enormously.

Well, I would first like to hear why this would be necessary...

I also don't know about translating into all languages; is that a policy of the CLDR? It could just be translated into a number of languages: those for which translations were found and vetted. For some it may just be a transcription, rather than a transliteration, of the native name; I imagine this would be the case for most Russian sub-entities, except a few well-known ones like Moscow and Saint Petersburg.

Oh, and if the sub-entities were to be transliterated names, for Russian for example, then I agree we don't need them ;-)

P. A.
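The suggestion above, keeping only the translations that were found and vetted and otherwise falling back to the native name, can be sketched in Python as follows. This is an illustration under assumptions, not CLDR data or a CLDR API: the `NAMES` table, its `native` key, and the `subdivision_name` helper are hypothetical, and `RU-MOW` is used merely as an example key in ISO 3166-2 style.

```python
# Hypothetical table: code -> {locale: vetted name}; "native" is the fallback.
NAMES = {
    "RU-MOW": {
        "native": "Москва",
        "en": "Moscow",
        "nl": "Moskou",
        "ja": "モスクワ",
        "zh": "莫斯科",
    },
}


def subdivision_name(code: str, locale: str) -> str:
    """Return a vetted localized name, else the native name, else the code."""
    entry = NAMES.get(code)
    if entry is None:
        return code  # nothing known about this subdivision
    language = locale.split("-")[0]
    # Try the full locale first (e.g. "en-GB"), then the bare language ("en").
    for key in (locale, language):
        if key in entry:
            return entry[key]
    return entry["native"]


print(subdivision_name("RU-MOW", "nl"))     # Moskou
print(subdivision_name("RU-MOW", "en-GB"))  # Moscow
print(subdivision_name("RU-MOW", "fi"))     # Москва
```

With such a fallback, a locale without a vetted translation costs no extra data, which limits the growth discussed in this thread.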
From asmodai@in-nomine.org Sun Mar 16 12:10:55 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Sun, 16 Mar 2008 12:10:55 -0600 (CST) Received: from nexus.in-nomine.org (dhammapada.xs4all.nl [82.95.168.248]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2GIArm8029669 for ; Sun, 16 Mar 2008 12:10:54 -0600 Received: from localhost (localhost.domini.in-nomine.org [127.0.0.1]) by nexus.in-nomine.org (Postfix) with ESMTP id 2BE94C111; Sun, 16 Mar 2008 19:10:52 +0100 (CET) X-Virus-Scanned: by amavisd-new using ClamAV at in-nomine.org Received: from nexus.in-nomine.org ([127.0.0.1]) by localhost (nexus.domini.in-nomine.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 2u2geIfmamOu; Sun, 16 Mar 2008 19:10:51 +0100 (CET) Received: by nexus.in-nomine.org (Postfix, from userid 1000) id 18888C13F; Sun, 16 Mar 2008 19:10:51 +0100 (CET) Date: Sun, 16 Mar 2008 19:10:51 +0100 From: Jeroen Ruigrok van der Werven To: Patrick Andries Cc: Naz Gassiep , cldr-users@unicode.org Subject: Re: Translations of subcountries. Message-ID: <20080316181051.GJ60713@nexus.in-nomine.org> References: <47DCF554.9090607@mira.net> <20080316105200.GI60713@nexus.in-nomine.org> <47DD56D8.9020000@xcential.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <47DD56D8.9020000@xcential.com> Organisation: Ninth Circle Enterprises User-Agent: Mutt/1.5.17 (2007-11-01) X-archive-position: 408 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: asmodai@in-nomine.org Precedence: bulk X-list: cldr-users -On [20080316 18:29], Patrick Andries (patrick.andries@xcential.com) wrote: >I also don't know about translating in all languages, is that a policy of >the CLDR? It could just be translated in a number of languages: those for >which translations where found and vetted. Whatever the country in question uses. 
In Dutch we already say Moskou for what in English is known as Moscow, in Russian as Москва́, in Japanese as モスクワ, in Chinese as 莫斯科, and so on.

A few of those cities and you already have 100+ localized names times the number of cities.

The point is, where do you draw the line? With countries it is relatively easy, but with cities you get into the realm of subjectivity about what is important and well-known outside the capitals. Right now you have between 193 and 245 countries plus their capitals, so that could theoretically add anywhere from 193-245 lines (one name per capital) to 19,300-24,500 lines (100 localized names per capital) just for the capitals.

-- Jeroen Ruigrok van der Werven / asmodai イェルーン ラウフロック ヴァン デル ウェルヴェン http://www.in-nomine.org/ | http://www.rangaku.org/ Tattva, achintya bheda abheda tattva...
From patrick.andries@xcential.com Sun Mar 16 12:56:17 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Sun, 16 Mar 2008 12:56:18 -0600 (CST) Received: from skywalker.myinternetwebhost.com (skywalker.myinternetwebhost.com [69.90.236.45]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2GIuGni023196 for ; Sun, 16 Mar 2008 12:56:17 -0600 Received: from dsl-205-205-142-75.cooptel.qc.ca [205.205.142.75] by skywalker.myinternetwebhost.com with SMTP; Sun, 16 Mar 2008 11:59:43 -0700 Message-ID: <47DD6D2C.3080802@xcential.com> Date: Sun, 16 Mar 2008 14:55:40 -0400 From: Patrick Andries User-Agent: Thunderbird 2.0.0.12 (Windows/20080213) MIME-Version: 1.0 To: Jeroen Ruigrok van der Werven CC: Naz Gassiep , cldr-users@unicode.org Subject: Re: Translations of subcountries. 
References: <47DCF554.9090607@mira.net> <20080316105200.GI60713@nexus.in-nomine.org> <47DD56D8.9020000@xcential.com> <20080316181051.GJ60713@nexus.in-nomine.org> In-Reply-To: <20080316181051.GJ60713@nexus.in-nomine.org> Content-Type: multipart/alternative; boundary="------------000505000300070503060604" X-archive-position: 409 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: patrick.andries@xcential.com Precedence: bulk X-list: cldr-users This is a multi-part message in MIME format. --------------000505000300070503060604 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Jeroen Ruigrok van der Werven a écrit : > -On [20080316 18:29], Patrick Andries (patrick.andries@xcential.com) wrote: > >> I also don't know about translating in all languages, is that a policy of >> the CLDR? It could just be translated in a number of languages: those for >> which translations where found and vetted. >> > > Whatever the country in question uses. > > In Dutch we already say Moskou for what in English is known as Moscow, in > Russian as Москва́, in Japanese as モスクワ, in Chinese as 莫斯科, and so on. > > A few of those cities and you already have 100+ localized names times the > number of cities. > > The point is, where do you draw the line? Well, that was my first question : what is the need for this ? P. A. --------------000505000300070503060604 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit Jeroen Ruigrok van der Werven a écrit :
--------------000505000300070503060604-- From asmodai@in-nomine.org Sun Mar 16 13:02:28 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Sun, 16 Mar 2008 13:02:28 -0600 (CST) Received: from nexus.in-nomine.org (dhammapada.xs4all.nl [82.95.168.248]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2GJ2Rp4027709 for ; Sun, 16 Mar 2008 13:02:28 -0600 Received: from localhost (localhost.domini.in-nomine.org [127.0.0.1]) by nexus.in-nomine.org (Postfix) with ESMTP id 395BDC12E; Sun, 16 Mar 2008 20:02:27 +0100 (CET) X-Virus-Scanned: by amavisd-new using ClamAV at in-nomine.org Received: from nexus.in-nomine.org ([127.0.0.1]) by localhost (nexus.domini.in-nomine.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id CD-uib1ddqoH; Sun, 16 Mar 2008 20:02:26 +0100 (CET) Received: by nexus.in-nomine.org (Postfix, from userid 1000) id 603E7C11E; Sun, 16 Mar 2008 20:02:26 +0100 (CET) Date: Sun, 16 Mar 2008 20:02:26 +0100 From: Jeroen Ruigrok van der Werven To: Patrick Andries Cc: Naz Gassiep , cldr-users@unicode.org Subject: Re: Translations of subcountries. Message-ID: <20080316190226.GK60713@nexus.in-nomine.org> References: <47DCF554.9090607@mira.net> <20080316105200.GI60713@nexus.in-nomine.org> <47DD56D8.9020000@xcential.com> <20080316181051.GJ60713@nexus.in-nomine.org> <47DD6D2C.3080802@xcential.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <47DD6D2C.3080802@xcential.com> Organisation: Ninth Circle Enterprises User-Agent: Mutt/1.5.17 (2007-11-01) X-archive-position: 410 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: asmodai@in-nomine.org Precedence: bulk X-list: cldr-users -On [20080316 19:56], Patrick Andries (patrick.andries@xcential.com) wrote: >Well, that was my first question : what is the need for this ? 
From the point of a central repository of localized information for software I can imagine that such information is very valuable. The question would still remain if it makes sense for the core of CLDR. But yes, I wonder what Naz' reason would be. -- Jeroen Ruigrok van der Werven / asmodai イェルーン ラウフロック ヴァン デル ウェルヴェン http://www.in-nomine.org/ | http://www.rangaku.org/ All conditioned things are impermanent. Work out your own salvation with diligence... From aaron@ijigg.com Sun Mar 16 16:10:24 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Sun, 16 Mar 2008 16:10:24 -0600 (CST) Received: from fg-out-1718.google.com (fg-out-1718.google.com [72.14.220.157]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2GMANRS006983 for ; Sun, 16 Mar 2008 16:10:24 -0600 Received: by fg-out-1718.google.com with SMTP id 13so1037708fge.9 for ; Sun, 16 Mar 2008 15:10:23 -0700 (PDT) Received: by 10.86.96.18 with SMTP id t18mr13074157fgb.13.1205705423124; Sun, 16 Mar 2008 15:10:23 -0700 (PDT) Received: by 10.86.53.15 with HTTP; Sun, 16 Mar 2008 15:10:23 -0700 (PDT) Message-ID: <756ec90c0803161510s1fdbdb79hcc0a25a7e02c5698@mail.gmail.com> Date: Sun, 16 Mar 2008 15:10:23 -0700 From: "Aaron Brick" Subject: Re: Translations of subcountries. Cc: "Naz Gassiep" , cldr-users@unicode.org In-Reply-To: <20080316105200.GI60713@nexus.in-nomine.org> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_4354_12363469.1205705423111" References: <47DCF554.9090607@mira.net> <20080316105200.GI60713@nexus.in-nomine.org> X-archive-position: 411 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: aaron@ijigg.com Precedence: bulk X-list: cldr-users ------=_Part_4354_12363469.1205705423111 Content-Type: text/plain; charset=ISO-2022-JP Content-Transfer-Encoding: 7bit Content-Disposition: inline outside of the CLDR, many place names have been translated to many languages in the interwiki links on wikipedia. 
coverage is pretty good for major cities/regions and languages.

aaron.

On 3/16/08, Jeroen Ruigrok van der Werven wrote:
>
> -On [20080316 11:37], Naz Gassiep (naz@mira.net) wrote:
> >Are there any plans to include translations of subcountries (states and
> >provinces) in the CLDR? Or is this just too big a job? How about city
> >names?
>
>
> I cannot speak for the CLDR project, but given that a small country like
> mine already has 12 provinces and knowing the amount of prefectures Japan
> has, I think adding such information to the CLDR and have it translated into
> all the present languages is going to bloat it enormously. Let alone city
> names.
>
>
> --
> Jeroen Ruigrok van der Werven / asmodai
> イェルーン ラウフロック ヴァン デル ウェルヴェン
> http://www.in-nomine.org/ | http://www.rangaku.org/
> Stand before it - there is no beginning. Follow it and there is no end.
> Stay with the Tao, move with the present...
>
>

------=_Part_4354_12363469.1205705423111
Content-Type: text/html; charset=ISO-2022-JP Content-Transfer-Encoding: 7bit Content-Disposition: inline

outside of the CLDR, many place names have been translated to many languages in the interwiki links on wikipedia. coverage is pretty good for major cities/regions and languages.

------=_Part_4354_12363469.1205705423111-- From mark.edward.davis@gmail.com Sun Mar 16 16:44:55 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Sun, 16 Mar 2008 16:44:56 -0600 (CST) Received: from wr-out-0506.google.com (wr-out-0506.google.com [64.233.184.230]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2GMitO1023568 for ; Sun, 16 Mar 2008 16:44:55 -0600 Received: by wr-out-0506.google.com with SMTP id 68so3414987wri.15 for ; Sun, 16 Mar 2008 15:44:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:references:x-google-sender-auth; bh=51VIH8A/VFGc7RcmumDrs34A68/taYEO2rvYfEzcVoc=; b=d7EKa8mO3pfJ8hN8b1xp12bOICVEAVupigZy9rNni8XVeXav594IgVJR/SXdwh0u5LSx1FBFfjGIsZYL97drHP548OeiRh/s4bL4gOwHT5UqluuTo1DFMbe8/OtstDa4SDXSYIqOmBasCIrbAaN3Aav0l9oKdI7zK2DxT0dX1YA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:references:x-google-sender-auth; b=OUNSp7vjZA0TNv1rShZz/Q180zK99jsPXkimh3K5UrDZ226C+9LpKoPrFveK00fQHfw3oKv9h3sJTDNQpJH0ds0DAOUix3tpa5pEaVVwm0V4+319Q9eBeMDC1ymG/eZWkaMytkZMNo+Gg3eBeNt+N+P3NIIElddcTR0iq8mw6M4= Received: by 10.150.98.18 with SMTP id v18mr7328076ybb.10.1205707494795; Sun, 16 Mar 2008 15:44:54 -0700 (PDT) Received: by 10.150.229.9 with HTTP; Sun, 16 Mar 2008 15:44:54 -0700 (PDT) Message-ID: <30b660a20803161544y1c2f6e99rd039dcf87d75a989@mail.gmail.com> Date: Sun, 16 Mar 2008 15:44:54 -0700 From: "Mark Davis" To: "Patrick Andries" Subject: Re: Translations of subcountries. 
Cc: "Jeroen Ruigrok van der Werven" , "Naz Gassiep" , cldr-users@unicode.org, "CLDR list" In-Reply-To: <47DD56D8.9020000@xcential.com> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_4751_31744628.1205707494808" References: <47DCF554.9090607@mira.net> <20080316105200.GI60713@nexus.in-nomine.org> <47DD56D8.9020000@xcential.com> X-Google-Sender-Auth: 1927903c1ae7b1e5 X-archive-position: 412 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: mark.davis@icu-project.org Precedence: bulk X-list: cldr-users ------=_Part_4751_31744628.1205707494808 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: base64 Content-Disposition: inline V2UndmUgY29uc2lkZXJlZCBhZGRpbmcgdGhlIElTTyBzdWJkaXZpc2lvbnMgdG8gQ0xEUiwgYnV0 IGl0IGlzIG5vdCBiZWluZwpjb25zaWRlcmVkIGZvciB0aGlzIHJlbGVhc2UuIFNlZSBidWcgMTUy OSwgYW5kIGZpbGUKaHR0cDovL3VuaWNvZGUub3JnL2NsZHIvZGF0YS9kb2NzL2F0dGFjaG1lbnRz L3N1YmRpdmlzaW9uX2hpZXJhcmNoeS54bWwuCgpNYXJrCgpPbiBTdW4sIE1hciAxNiwgMjAwOCBh dCAxMDoyMCBBTSwgUGF0cmljayBBbmRyaWVzIDwKcGF0cmljay5hbmRyaWVzQHhjZW50aWFsLmNv bT4gd3JvdGU6Cgo+IEplcm9lbiBSdWlncm9rIHZhbiBkZXIgV2VydmVuIGEgw6ljcml0IDoKPiA+ IC1PbiBbMjAwODAzMTYgMTE6MzddLCBOYXogR2Fzc2llcCAobmF6QG1pcmEubmV0KSB3cm90ZToK PiA+Cj4gPj4gQXJlIHRoZXJlIGFueSBwbGFucyB0byBpbmNsdWRlIHRyYW5zbGF0aW9ucyBvZiBz dWJjb3VudHJpZXMgKHN0YXRlcyBhbmQKPiA+PiBwcm92aW5jZXMpIGluIHRoZSBDTERSPyBPciBp cyB0aGlzIGp1c3QgdG9vIGJpZyBhIGpvYj8gSG93IGFib3V0IGNpdHkKPiA+PiBuYW1lcz8KPiA+ Pgo+ID4KPiA+IEkgY2Fubm90IHNwZWFrIGZvciB0aGUgQ0xEUiBwcm9qZWN0LCBidXQgZ2l2ZW4g dGhhdCBhIHNtYWxsIGNvdW50cnkgbGlrZQo+ID4gbWluZSBhbHJlYWR5IGhhcyAxMiBwcm92aW5j ZXMgYW5kIGtub3dpbmcgdGhlIGFtb3VudCBvZiBwcmVmZWN0dXJlcwo+IEphcGFuCj4gPiBoYXMs IEkgdGhpbmsgYWRkaW5nIHN1Y2ggaW5mb3JtYXRpb24gdG8gdGhlIENMRFIgYW5kIGhhdmUgaXQg dHJhbnNsYXRlZAo+IGludG8KPiA+IGFsbCB0aGUgcHJlc2VudCBsYW5ndWFnZXMgaXMgZ29pbmcg dG8gYmxvYXQgaXQgZW5vcm1vdXNseS4KPgo+IFdlbGwsIEkgd291bGQgbGlrZSB0byBmaXJzdCBo 
ZWFyIHdoeSB0aGlzIHdvdWxkIGJlIG5lY2Vzc2FyeS4uLgo+Cj4gSSBhbHNvIGRvbid0IGtub3cg YWJvdXQgdHJhbnNsYXRpbmcgaW4gYWxsIGxhbmd1YWdlcywgaXMgdGhhdCBhIHBvbGljeQo+IG9m IHRoZSBDTERSPyBJdCBjb3VsZCBqdXN0IGJlIHRyYW5zbGF0ZWQgaW4gYSBudW1iZXIgb2YgbGFu Z3VhZ2VzOiB0aG9zZQo+IGZvciB3aGljaCB0cmFuc2xhdGlvbnMgd2hlcmUgZm91bmQgYW5kIHZl dHRlZC4gRm9yIHNvbWUgaXQgbWF5IGp1c3QgYmUgYQo+IHRyYW5zY3JpcHRpb24sIHJhdGhlciB0 aGFuIGEgdHJhbnNsaXRlcmF0aW9uLCBvZiB0aGUgbmF0aXZlIG5hbWUsIHRoaXMKPiB3b3VsZCBi ZSB0aGUgY2FzZSBpbiBtb3N0IFJ1c3NpYW4gc3ViLWVudGl0aWVzIEkgaW1hZ2luZSBleGNlcHQg YSBmZXcKPiB3ZWxsLWtub3duIG9uZXMgbGlrZSBNb3Njb3cgYW5kIFNhaW50LVBldGVyc2J1cmcu IE9oLCBhbmQgaWYgdGhlCj4gc3ViZW50aXRpZXMgd2VyZSB0byBiZSB0cmFuc2xpdGVyYXRlZCBu YW1lcyBmb3IgUnVzc2lhbiBmb3IgZXhhbXBsZSwKPiB0aGVuIEkgYWdyZWUgd2UgZG9uJ3QgbmVl ZCB0aGVtIDstKQo+Cj4gUC4gQS4KPgo+Cj4KPgo+Cj4KCgotLSAKTWFyawo= ------=_Part_4751_31744628.1205707494808 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: base64 Content-Disposition: inline V2UmIzM5O3ZlIGNvbnNpZGVyZWQgYWRkaW5nIHRoZSBJU08gc3ViZGl2aXNpb25zIHRvIENMRFIs IGJ1dCBpdCBpcyBub3QgYmVpbmcgY29uc2lkZXJlZCBmb3IgdGhpcyByZWxlYXNlLiBTZWUgYnVn IDE1MjksIGFuZCBmaWxlIDxhIGhyZWY9Imh0dHA6Ly91bmljb2RlLm9yZy9jbGRyL2RhdGEvZG9j cy9hdHRhY2htZW50cy9zdWJkaXZpc2lvbl9oaWVyYXJjaHkueG1sIj5odHRwOi8vdW5pY29kZS5v cmcvY2xkci9kYXRhL2RvY3MvYXR0YWNobWVudHMvc3ViZGl2aXNpb25faGllcmFyY2h5LnhtbDwv YT4uPGJyPgo8YnI+TWFyazxicj48YnI+PGRpdiBjbGFzcz0iZ21haWxfcXVvdGUiPk9uIFN1biwg TWFyIDE2LCAyMDA4IGF0IDEwOjIwIEFNLCBQYXRyaWNrIEFuZHJpZXMgJmx0OzxhIGhyZWY9Im1h aWx0bzpwYXRyaWNrLmFuZHJpZXNAeGNlbnRpYWwuY29tIj5wYXRyaWNrLmFuZHJpZXNAeGNlbnRp YWwuY29tPC9hPiZndDsgd3JvdGU6PGJyPjxibG9ja3F1b3RlIGNsYXNzPSJnbWFpbF9xdW90ZSIg c3R5bGU9ImJvcmRlci1sZWZ0OiAxcHggc29saWQgcmdiKDIwNCwgMjA0LCAyMDQpOyBtYXJnaW46 IDBwdCAwcHQgMHB0IDAuOGV4OyBwYWRkaW5nLWxlZnQ6IDFleDsiPgpKZXJvZW4gUnVpZ3JvayB2 YW4gZGVyIFdlcnZlbiBhIMOpY3JpdCA6PGJyPgo8ZGl2IGNsYXNzPSJJaDJFM2QiPiZndDsgLU9u IFsyMDA4MDMxNiAxMTozN10sIE5heiBHYXNzaWVwICg8YSBocmVmPSJtYWlsdG86bmF6QG1pcmEu 
bmV0Ij5uYXpAbWlyYS5uZXQ8L2E+KSB3cm90ZTo8YnI+CiZndDs8YnI+CiZndDsmZ3Q7IEFyZSB0 aGVyZSBhbnkgcGxhbnMgdG8gaW5jbHVkZSB0cmFuc2xhdGlvbnMgb2Ygc3ViY291bnRyaWVzIChz dGF0ZXMgYW5kPGJyPgomZ3Q7Jmd0OyBwcm92aW5jZXMpIGluIHRoZSBDTERSPyBPciBpcyB0aGlz IGp1c3QgdG9vIGJpZyBhIGpvYj8gSG93IGFib3V0IGNpdHk8YnI+CiZndDsmZ3Q7IG5hbWVzPzxi cj4KJmd0OyZndDs8YnI+CiZndDs8YnI+CiZndDsgSSBjYW5ub3Qgc3BlYWsgZm9yIHRoZSBDTERS IHByb2plY3QsIGJ1dCBnaXZlbiB0aGF0IGEgc21hbGwgY291bnRyeSBsaWtlPGJyPgomZ3Q7IG1p bmUgYWxyZWFkeSBoYXMgMTIgcHJvdmluY2VzIGFuZCBrbm93aW5nIHRoZSBhbW91bnQgb2YgcHJl ZmVjdHVyZXMgSmFwYW48YnI+CiZndDsgaGFzLCBJIHRoaW5rIGFkZGluZyBzdWNoIGluZm9ybWF0 aW9uIHRvIHRoZSBDTERSIGFuZCBoYXZlIGl0IHRyYW5zbGF0ZWQgaW50bzxicj4KJmd0OyBhbGwg dGhlIHByZXNlbnQgbGFuZ3VhZ2VzIGlzIGdvaW5nIHRvIGJsb2F0IGl0IGVub3Jtb3VzbHkuPGJy Pgo8YnI+CjwvZGl2PldlbGwsIEkgd291bGQgbGlrZSB0byBmaXJzdCBoZWFyIHdoeSB0aGlzIHdv dWxkIGJlIG5lY2Vzc2FyeS4uLjxicj4KPGJyPgpJIGFsc28gZG9uJiMzOTt0IGtub3cgYWJvdXQg dHJhbnNsYXRpbmcgaW4gYWxsIGxhbmd1YWdlcywgaXMgdGhhdCBhIHBvbGljeTxicj4Kb2YgdGhl IENMRFI/IEl0IGNvdWxkIGp1c3QgYmUgdHJhbnNsYXRlZCBpbiBhIG51bWJlciBvZiBsYW5ndWFn ZXM6IHRob3NlPGJyPgpmb3Igd2hpY2ggdHJhbnNsYXRpb25zIHdoZXJlIGZvdW5kIGFuZCB2ZXR0 ZWQuIEZvciBzb21lIGl0IG1heSBqdXN0IGJlIGE8YnI+CnRyYW5zY3JpcHRpb24sIHJhdGhlciB0 aGFuIGEgdHJhbnNsaXRlcmF0aW9uLCBvZiB0aGUgbmF0aXZlIG5hbWUsIHRoaXM8YnI+CndvdWxk IGJlIHRoZSBjYXNlIGluIG1vc3QgUnVzc2lhbiBzdWItZW50aXRpZXMgSSBpbWFnaW5lIGV4Y2Vw dCBhIGZldzxicj4Kd2VsbC1rbm93biBvbmVzIGxpa2UgTW9zY293IGFuZCBTYWludC1QZXRlcnNi dXJnLiBPaCwgYW5kIGlmIHRoZTxicj4Kc3ViZW50aXRpZXMgd2VyZSB0byBiZSB0cmFuc2xpdGVy YXRlZCBuYW1lcyBmb3IgUnVzc2lhbiBmb3IgZXhhbXBsZSw8YnI+CnRoZW4gSSBhZ3JlZSB3ZSBk b24mIzM5O3QgbmVlZCB0aGVtIDstKTxicj4KPGZvbnQgY29sb3I9IiM4ODg4ODgiPjxicj4KUC4g QS48YnI+Cjxicj4KPGJyPgo8YnI+Cjxicj4KPGJyPgo8L2ZvbnQ+PC9ibG9ja3F1b3RlPjwvZGl2 Pjxicj48YnIgY2xlYXI9ImFsbCI+PGJyPi0tIDxicj5NYXJrCg== ------=_Part_4751_31744628.1205707494808-- From naz@mira.net Tue Mar 18 00:10:55 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Tue, 18 Mar 2008 00:10:55 -0600 (CST) 
Received: from smtp-auth.no-ip.com (smtp-auth.no-ip.com [204.16.252.95]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2I6AsX5013154 for ; Tue, 18 Mar 2008 00:10:55 -0600 X-No-IP: mrnaz.com@noip-smtp X-Report-Spam-To: abuse@no-ip.com Received: from [192.168.0.21] (ppp121-44-226-239.lns2.mel4.internode.on.net [121.44.226.239]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) (Authenticated sender: mrnaz.com@noip-smtp) by smtp-auth.no-ip.com (Postfix) with ESMTP id B33C8BCA8; Mon, 17 Mar 2008 23:10:48 -0700 (PDT) Message-ID: <47DF5CE6.3050808@mira.net> Date: Tue, 18 Mar 2008 17:10:46 +1100 From: Naz Gassiep User-Agent: Thunderbird 2.0.0.12 (Windows/20080213) MIME-Version: 1.0 To: Mark Davis CC: cldr-users@unicode.org Subject: Re: Translations of subcountries. References: <47DCF554.9090607@mira.net> <20080316105200.GI60713@nexus.in-nomine.org> <47DD56D8.9020000@xcential.com> <30b660a20803161544y1c2f6e99rd039dcf87d75a989@mail.gmail.com> In-Reply-To: <30b660a20803161544y1c2f6e99rd039dcf87d75a989@mail.gmail.com> Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 8bit X-archive-position: 413 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: naz@mira.net Precedence: bulk X-list: cldr-users Interesting, the structure of that XML file seems to me to be suboptimal. IMHO, if the CLDR is going to follow ISO 3166-2 then there should be no deviation from that standard as far as IDs are concerned, as the hierarchy already ensures that there are no ambiguities. Is this the final structure, or is this still open to discussion?
- Naz.

Mark Davis wrote:
We've considered adding the ISO subdivisions to CLDR, but it is not being considered for this release. See bug 1529, and file http://unicode.org/cldr/data/docs/attachments/subdivision_hierarchy.xml.

Mark

On Sun, Mar 16, 2008 at 10:20 AM, Patrick Andries <patrick.andries@xcential.com> wrote:
Jeroen Ruigrok van der Werven wrote:
> -On [20080316 11:37], Naz Gassiep (naz@mira.net) wrote:
>
>> Are there any plans to include translations of subcountries (states and
>> provinces) in the CLDR? Or is this just too big a job? How about city
>> names?
>>
>
> I cannot speak for the CLDR project, but given that a small country like
> mine already has 12 provinces and knowing the amount of prefectures Japan
> has, I think adding such information to the CLDR and have it translated into
> all the present languages is going to bloat it enormously.

Well, I would like to first hear why this would be necessary...

I also don't know about translating in all languages, is that a policy
of the CLDR? It could just be translated in a number of languages: those
for which translations were found and vetted. For some it may just be a
transcription, rather than a transliteration, of the native name, this
would be the case in most Russian sub-entities I imagine except a few
well-known ones like Moscow and Saint-Petersburg. Oh, and if the
subentities were to be transliterated names for Russian for example,
then I agree we don't need them ;-)

P. A.








--
Mark
From naz@mira.net Tue Mar 18 08:45:23 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Tue, 18 Mar 2008 08:45:23 -0600 (CST) Received: from smtp-auth.no-ip.com (smtp-auth.no-ip.com [204.16.252.95]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2IEjMDh001573 for ; Tue, 18 Mar 2008 08:45:23 -0600 X-No-IP: mrnaz.com@noip-smtp X-Report-Spam-To: abuse@no-ip.com Received: from [192.168.0.21] (ppp121-44-226-239.lns2.mel4.internode.on.net [121.44.226.239]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) (Authenticated sender: mrnaz.com@noip-smtp) by smtp-auth.no-ip.com (Postfix) with ESMTP id DC102BF07; Tue, 18 Mar 2008 07:45:16 -0700 (PDT) Message-ID: <47DFD57A.20808@mira.net> Date: Wed, 19 Mar 2008 01:45:14 +1100 From: Naz Gassiep User-Agent: Thunderbird 2.0.0.12 (Windows/20080213) MIME-Version: 1.0 CC: Patrick Andries , cldr-users@unicode.org Subject: Re: Translations of subcountries. References: <47DCF554.9090607@mira.net> <20080316105200.GI60713@nexus.in-nomine.org> <47DD56D8.9020000@xcential.com> <20080316181051.GJ60713@nexus.in-nomine.org> <47DD6D2C.3080802@xcential.com> <20080316190226.GK60713@nexus.in-nomine.org> In-Reply-To: <20080316190226.GK60713@nexus.in-nomine.org> Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 7bit X-archive-position: 414 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: naz@mira.net Precedence: bulk X-list: cldr-users There are many i18n issues that the CLDR doesn't cover that I feel it shouldn't. However, I am unable to think of a use case where country names would be required, and subdivision names would not at least be highly relevant. Any storage of addresses requires more than the country. For example, I use the ISO-3166-1 and ISO 3166-2 lists in my applications to allow discrete specification of country and subdivision. 
At the moment, when a user enters, for example, an address in Arabic, the country is displayed in Arabic, the street, suburb and city names are entered in Arabic, the postcode uses Arabic numerals, and only the subcountry is in English. This (to me) is a suboptimal situation. There is a very narrow use case where an app would need translation of countries, and not subdivisions.

A list of city translations is a different matter altogether, given that a) it's a far bigger task, b) it's a far more subjective task, c) there is no established standard for city lists, finished or not and d) there is far less of a use-case for discrete city data. I will re-raise this issue in a later post as I don't want to cloud the subcountry question.

My view is that the CLDR could follow the ISO-3166-2 specification, translating only those names that are in the spec, and leaving the others out until such time as they have been included.

I also acknowledge that the 3166-2 spec changes the key for each subdivision from time to time, as more appropriate abbreviations are found (based on local conventions that come to light etc). This may prove to be a challenge; however, I think that use of "provisional", "draft" and "alt" indicators provides an adequate mechanism for handling this uncertainty in a forwards and backwards compatible manner.

Regards,
- Naz.

Jeroen Ruigrok van der Werven wrote:
-On [20080316 19:56], Patrick Andries (patrick.andries@xcential.com) wrote:
  
Well, that was my first question : what is the need for this ?
    

From the point of a central repository of localized information for software
I can imagine that such information is very valuable.
The question would still remain if it makes sense for the core of CLDR.

But yes, I wonder what Naz' reason would be.

  
From ed.trager@gmail.com Tue Mar 18 12:35:39 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Tue, 18 Mar 2008 12:35:39 -0600 (CST) Received: from ti-out-0910.google.com (ti-out-0910.google.com [209.85.142.189]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2IIZbwK005339 for ; Tue, 18 Mar 2008 12:35:38 -0600 Received: by ti-out-0910.google.com with SMTP id 28so9232tif.11 for ; Tue, 18 Mar 2008 11:35:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=LZuQD34rNpr/ML+G2OyKPPshRntUYyUO4jEZS6L8Sz4=; b=a3OBRJpkAwst++oZQUhgvs9RjbzYA2wzcKm4U9TV/6CWMHiEsemQXn8mXULr2o0QOq1bBiNu3ogKmO4SkSxT11xpP2jXFURHZsB8gjpsEg6ModAkShbusD6yiF1auinQqbxRHccUP1LIe3YI5mz76Hs/qvBZp0/69iNroNOwwuU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=message-id:date:from:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=QqC5TA63zPmPx0OIgzhRX91XeBO454WlCTEgPQeKgXMUILmJghLhWwablD2CVcGPAP8BdK5dBEcMswZMj75bm2LYVDhuJ7G3hxj3Xq7aZdG+7W22jF+2PHtN3CLNWQWYUXYob0mcKFVy2pHgLZOexlg9xgu9JSWg5SNGTL9JL5E= Received: by 10.150.122.13 with SMTP id u13mr1179799ybc.131.1205865333929; Tue, 18 Mar 2008 11:35:33 -0700 (PDT) Received: by 10.150.185.12 with HTTP; Tue, 18 Mar 2008 11:35:33 -0700 (PDT) Message-ID: <416e2cf10803181135o5befdf95yd1cfddb140bfc7e4@mail.gmail.com> Date: Tue, 18 Mar 2008 14:35:33 -0400 From: "Ed Trager" To: cldr-users@unicode.org Subject: Re: Translations of subcountries. 
In-Reply-To: <47DFD57A.20808@mira.net> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline References: <47DCF554.9090607@mira.net> <20080316105200.GI60713@nexus.in-nomine.org> <47DD56D8.9020000@xcential.com> <20080316181051.GJ60713@nexus.in-nomine.org> <47DD6D2C.3080802@xcential.com> <20080316190226.GK60713@nexus.in-nomine.org> <47DFD57A.20808@mira.net> Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by unicode.org id m2IIZbwK005339 X-archive-position: 415 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: ed.trager@gmail.com Precedence: bulk X-list: cldr-users

Hi, All,

Just for the sake of discussion, suppose that we compiled data where the "keys" were the names of regions/provinces/states/other subdivisions of countries written in the International Phonetic Alphabet (IPA). For example, the state of Illinois in the United States could be entered as /ˌɪlɨˈnɔɪ/.

Mappings could be provided as is customarily done in message catalogs for all cases where conventional spellings or translations existed, i.e., KEY=/ˌɪlɨˈnɔɪ/ maps to ENGLISH_VALUE="Illinois". (Actually, in this case, we can probably move up to the script level and just say KEY=/ˌɪlɨˈnɔɪ/ maps to LATIN_VALUE="Illinois"). We would do the same thing for other scripts where conventionalized spellings already exist, i.e., KEY=/ˌɪlɨˈnɔɪ/ maps to HANS_VALUE="伊利诺" (or, more correctly, "伊利诺州").

In cases where conventional spellings or translations are not common, one could theoretically write software to translate the IPA transcription "key" into reasonable phonetics in whatever script was desired, i.e., KEY=/ˌɪlɨˈnɔɪ/ gets transcribed into ARABIC_SCRIPT_VALUE="إلينوي" and into THAI_SCRIPT_VALUE="อิลินอย", etc.
So we would have something like this pseudo code:

    if (value exists in message catalog) {
        return value;
    } else {
        return getPhoneticTranscriptionFromKey(key);
    }

Such computer-generated phonetic transcriptions might differ from human-generated transcriptions (for example, Thai people know that the "s" in "Illinois" is silent, so the conventional phonetic-based transcription for Illinois is "อิลลินอยส์" with a karan (consonant silencer) over the final consonant) -- but despite some shortcomings, this would nevertheless still be a great way to display geographic place names in software when conventional translations are lacking in message catalogs (or, likewise, in the CLDR).

Of course, one would have to write code to transcribe between IPA and various scripts and languages, but that is a fixed or definable cost. Such a phonetic transcription system would be kind of fun to write. Obviously, such a system cannot be "perfect" because of the inherent phonetic representational limits of most written language orthographies. But it would not have to be perfect -- just good enough to generally represent the sound of a place name, more or less.

Just an idea - Ed Trager

On Tue, Mar 18, 2008 at 10:45 AM, Naz Gassiep wrote:
>
> There are many i18n issues that the CLDR doesn't cover that I feel it
> shouldn't. However, I am unable to think of a use case where country names
> would be required, and subdivision names would not at least be highly
> relevant. Any storage of addresses requires more than the country. For
> example, I use the ISO-3166-1 and ISO 3166-2 lists in my applications to
> allow discrete specification of country and subdivision. At the moment, a
> user entering, for example, an address in Arabic, the country would be
> displayed in Arabic, the street, suburb and city names would be entered in
> Arabic, the postcode would use Arabic numerals and only the subcountry would
> be in English. This (to me) is a suboptimal situation. There is a very
> narrow use case where an app would need translation of countries, and not
> subdivisions.
>
> A list of city translations is a different matter altogether, given that a)
> it's a far bigger task, b) it's a far more subjective task, c) there is no
> established standard for city lists, finished or not and d) there is far
> less of a use-case for discrete city data. I will re-raise this issue in a
> later post as I don't want to cloud the subcountry question.
>
> My view is that the CLDR could follow the ISO-3166-2 specification,
> translating only those names that are in the spec, and leaving the others
> out until such time as they have been included.
>
> I also acknowledge that the 3166-2 spec changes the key for each
> subdivision from time to time, as more appropriate abbreviations are found
> (based on local conventions that come to light etc). This may prove to be a
> challenge, however I think that use of "provisional", "draft" and "alt"
> indicators provide an adequate mechanism for handling this uncertainty in a
> forwards and backwards compatible manner.
>
> Regards,
> - Naz.
>
>
> Jeroen Ruigrok van der Werven wrote:
> -On [20080316 19:56], Patrick Andries (patrick.andries@xcential.com) wrote:
>
>
> Well, that was my first question : what is the need for this ?
>
> From the point of a central repository of localized information for software
> I can imagine that such information is very valuable.
> The question would still remain if it makes sense for the core of CLDR.
>
> But yes, I wonder what Naz' reason would be.
> > > From mark.edward.davis@gmail.com Tue Mar 18 19:50:14 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Tue, 18 Mar 2008 19:50:14 -0600 (CST) Received: from ag-out-0708.google.com (ag-out-0708.google.com [72.14.246.242]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2J1oDeU025096 for ; Tue, 18 Mar 2008 19:50:14 -0600 Received: by ag-out-0708.google.com with SMTP id 35so265038aga.11 for ; Tue, 18 Mar 2008 18:50:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:sender:to:subject:cc:mime-version:content-type:x-google-sender-auth; bh=Ea+ISsDzKF7kMCyo9nTLik0iwlAVZ845BO/mwyIjArM=; b=DMzKy9tgKm1K2Q5AEqGoG7hNtwRjBxr/n8Tk6HH5QZVxM7Wo8udHjcnuimjjdog/u8Y0aQx5GXQtTP/kXPdfdnoRLRYMkxc8YHmW9FN2Nkia5XXKvv5tAYzBGR0CVNr5tWInNtSeeMuwC8qxG8Yux5WCf3XcukT7AFdO6c74bF0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=message-id:date:from:sender:to:subject:cc:mime-version:content-type:x-google-sender-auth; b=ULfx04g9VF4sZa38IMVqxLyu0uipFm+mcoSWmPH0GvnAZGZPpZvZRAFgJAU5ZIybUzHXBnzdVW2eSQmZ5yQe9gvd7hSD14jm1uwV5u/2eJnQ9dOagtZF70oLhn/s8XE9TZ+5Jg0vMUlzc8vxezQvXtldqTAkkkzzIx5Ch7yWAFc= Received: by 10.151.148.2 with SMTP id a2mr1436699ybo.186.1205891413120; Tue, 18 Mar 2008 18:50:13 -0700 (PDT) Received: by 10.150.229.9 with HTTP; Tue, 18 Mar 2008 18:50:13 -0700 (PDT) Message-ID: <30b660a20803181850t5d72b81cvb089eae4d9d97bde@mail.gmail.com> Date: Tue, 18 Mar 2008 18:50:13 -0700 From: "Mark Davis" To: "CLDR list" Subject: fill-in fields Cc: "cldr-users@unicode.org" MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_12540_18798366.1205891413124" X-Google-Sender-Auth: 9272b803640f329f X-archive-position: 416 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: mark.davis@icu-project.org Precedence: bulk X-list: cldr-users ------=_Part_12540_18798366.1205891413124 
Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Content-Disposition: inline

During today's telecon, we discussed trying to do the following:

Ideally, when you click into the entry field (or on its radio button) of a plural form, the text of the singular would appear so that it can be edited.

I was thinking that a generalization of the above might make data entry easier in general. Namely, when you click on an *empty* text entry field (or its radio button), we always try to populate it with a guess, which can then be edited further. That would save people repetitive cutting and pasting.

Here is a suggestion for what that guess should be: go through the following list and take the first that matches.

   1. The "winning" value.
      - Example: if "Donnerstag" has the most votes for 'thursday', then clicking on the empty field will fill in "Donnerstag".
   2. The singular form.
      - Example: if the value for 'hour' is "heure", then clicking on the entry field for 'hours' will insert "heure".
   3. The parent's value (except where that parent is Root)
      - Example: if I'm in [de_CH] and there are no proposals for 'thursday', then clicking on the empty field will fill in "Donnerstag" from [de].
   4. Finally, if there is no other information, and it is the Latin script, then fill in the English value
      - Example: "Afghanistan"
   5. Otherwise don't fill in anything.

So, would this make things enough easier as to be worth doing, in general? And if so, does the above proposal need tweaking?

--
Mark

------=_Part_12540_18798366.1205891413124
Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 7bit Content-Disposition: inline

During today's telecon, we discussed trying to do the following:

Ideally, when you click into the entry field of a plural form (or on its radio button), the text of the singular would appear so that it can be edited.


I was thinking that a generalization of the above might make data entry easier in general. Namely, when you click on an empty text entry field (or its radio button), we always try to populate it with a guess, which can then be edited further. That would save people repetitive cutting and pasting.

Here is a suggestion for what that guess should be: go through the following list and take the first item that matches.
  1. The "winning" value.
    • Example: if "Donnerstag" has the most votes for 'thursday', then clicking on the empty field will fill in "Donnerstag".
  2. The singular form.
    • Example: if the value for 'hour' is "heure", then clicking on the entry field for 'hours' will insert "heure".
  3. The parent's value (except where that parent is Root)
    • Example: if I'm in [de_CH] and there are no proposals for 'thursday', then clicking on the empty field will fill in "Donnerstag" from [de].
  4. Finally, if there is no other information, and it is the Latin script, then fill in the English value
    • Example: "Afghanistan"
  5. Otherwise don't fill in anything.
So, would this make things enough easier as to be worth doing, in general? And if so, does the above proposal need tweaking?
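
One way to read the list above is as a first-match fallback chain. The following is only a sketch in Python under stated assumptions: FieldContext, guess_fill_in and on_click are hypothetical names made up for illustration, not part of CLDR or the Survey Tool, and the inputs are assumed to have been looked up already.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldContext:
    """Hypothetical snapshot of what is known about one entry field."""
    winning_value: Optional[str] = None   # rule 1: value with the most votes
    singular_value: Optional[str] = None  # rule 2: singular form (for plural fields)
    parent_value: Optional[str] = None    # rule 3: value from the parent locale
    parent_is_root: bool = False          # rule 3 is skipped when the parent is Root
    english_value: Optional[str] = None   # rule 4: English value
    is_latin_script: bool = False         # rule 4 applies only to Latin-script locales


def guess_fill_in(ctx: FieldContext) -> Optional[str]:
    """Walk rules 1-5 in order and return the first guess that applies."""
    if ctx.winning_value:
        return ctx.winning_value
    if ctx.singular_value:
        return ctx.singular_value
    if ctx.parent_value and not ctx.parent_is_root:
        return ctx.parent_value
    if ctx.is_latin_script and ctx.english_value:
        return ctx.english_value
    return None  # rule 5: don't fill in anything


def on_click(current_text: str, ctx: FieldContext) -> str:
    """Populate only an empty field; text that is already there is left alone."""
    if current_text:
        return current_text
    return guess_fill_in(ctx) or ""
```

Keeping the rule order in one function means any later tweak to the proposal (dropping or restricting a rule) would touch a single place.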

--
Mark
------=_Part_12540_18798366.1205891413124--

From eik@iki.fi Tue Mar 18 23:42:34 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Tue, 18 Mar 2008 23:42:40 -0600 (CST) Received: from smtp6.pp.htv.fi (smtp6.pp.htv.fi [213.243.153.40]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2J5gXJj001885; Tue, 18 Mar 2008 23:42:34 -0600 Received: from inspiron (cs181253188.pp.htv.fi [82.181.253.188]) by smtp6.pp.htv.fi (Postfix) with ESMTP id 4A41C5BC06B; Wed, 19 Mar 2008 07:42:32 +0200 (EET) Reply-To: From: "Erkki I. Kolehmainen" To: "'Mark Davis'" , "'CLDR list'" Cc: Subject: RE: fill-in fields Date: Wed, 19 Mar 2008 07:42:30 +0200 Message-ID: <000001c88984$063b0a00$0200a8c0@inspiron> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_NextPart_000_0001_01C88994.C9C623F0" X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.6838 Importance: Normal Thread-Index: AciJZFQODsirpPC/Qzu6gvePRRZElAAHpjsg In-Reply-To: <30b660a20803181850t5d72b81cvb089eae4d9d97bde@mail.gmail.com> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3198 X-archive-position: 417 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: eik@iki.fi Precedence: bulk X-list: cldr-users

This is a multi-part message in MIME format.
------=_NextPart_000_0001_01C88994.C9C623F0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable

I'm all for this, since it would make a slight modification much easier. (If duplicates would end up being unnecessarily submitted, they'll be stripped off in the end.)

(As discussed in the telecon, this is essentially a must for the plural expressions for the currencies, due to the large number of new entries.)

Regards, Erkki

Erkki I. Kolehmainen

-----Original Message-----
From: cldr-bounce@unicode.org [mailto:cldr-bounce@unicode.org] On Behalf Of Mark Davis
Sent: 19 March 2008 3:50
To: CLDR list
Cc: cldr-users@unicode.org
Subject: fill-in fields

During today's telecon, we discussed trying to do the following:

Ideally, when you click into the entry field of a plural form (or on its radio button), the text of the singular would appear so that it can be edited.

I was thinking that a generalization of the above might make data entry easier in general. Namely, when you click on an empty text entry field (or its radio button), we always try to populate it with a guess, which can then be edited further. That would save people repetitive cutting and pasting.

Here is a suggestion for what that guess should be: go through the following list and take the first item that matches.
  1. The "winning" value.
    * Example: if "Donnerstag" has the most votes for 'thursday', then clicking on the empty field will fill in "Donnerstag".
  2. The singular form.
    * Example: if the value for 'hour' is "heure", then clicking on the entry field for 'hours' will insert "heure".
  3. The parent's value (except where that parent is Root)
    * Example: if I'm in [de_CH] and there are no proposals for 'thursday', then clicking on the empty field will fill in "Donnerstag" from [de].
  4. Finally, if there is no other information, and it is the Latin script, then fill in the English value
    * Example: "Afghanistan"
  5. Otherwise don't fill in anything.
So, would this make things enough easier as to be worth doing, in general? And if so, does the above proposal need tweaking?

--
Mark
------=_NextPart_000_0001_01C88994.C9C623F0 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
Message
I'm all for this, since it would make a slight modification much easier. (If duplicates would end up being unnecessarily submitted, they'll be stripped off in the end.)

(As discussed in the telecon, this is essentially a must for the plural expressions for the currencies, due to the large number of new entries.)

Regards, Erkki

Erkki I. Kolehmainen

-----Original Message-----
From: cldr-bounce@unicode.org [mailto:cldr-bounce@unicode.org] On Behalf Of Mark Davis
Sent: 19 March 2008 3:50
To: CLDR list
Cc: cldr-users@unicode.org
Subject: fill-in fields

During today's telecon, we discussed trying to do the following:

Ideally, when you click into the entry field of a plural form (or on its radio button), the text of the singular would appear so that it can be edited.


I was thinking that a generalization of the above might make data entry easier in general. Namely, when you click on an empty text entry field (or its radio button), we always try to populate it with a guess, which can then be edited further. That would save people repetitive cutting and pasting.

Here is a suggestion for what that guess should be: go through the following list and take the first item that matches.
  1. The "winning" value.
    • Example: if "Donnerstag" has the most votes for 'thursday', then clicking on the empty field will fill in "Donnerstag".
  2. The singular form.
    • Example: if the value for 'hour' is "heure", then clicking on the entry field for 'hours' will insert "heure".
  3. The parent's value (except where that parent is Root)
    • Example: if I'm in [de_CH] and there are no proposals for 'thursday', then clicking on the empty field will fill in "Donnerstag" from [de].
  4. Finally, if there is no other information, and it is the Latin script, then fill in the English value
    • Example: "Afghanistan"
  5. Otherwise don't fill in anything.
So, would this make things enough easier as to be worth doing, in general? And if so, does the above proposal need tweaking?

--
Mark
------=_NextPart_000_0001_01C88994.C9C623F0-- From mark.edward.davis@gmail.com Wed Mar 19 08:06:48 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Wed, 19 Mar 2008 08:06:48 -0600 (CST) Received: from ti-out-0910.google.com (ti-out-0910.google.com [209.85.142.184]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2JE6iqA008676 for ; Wed, 19 Mar 2008 08:06:47 -0600 Received: by ti-out-0910.google.com with SMTP id 28so137232tif.11 for ; Wed, 19 Mar 2008 07:06:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:references:x-google-sender-auth; bh=mtmF3ZORo41VOoDpiDevyK7jr+9nGNV5DZoBHR7fiRs=; b=BZLHo7Obb8YJQD9YcCPGKPszaIWhiT1WNAIfIKU8q+r1MozSnlXYcyQWHpbWpHNqPZY8rfpYaon+4kLOpMgodpWd/ov1hasZsMVWa5EcgHFpWz5RjlqPDibuQFUfyMaXD+XhJeui6iIw7BNzbcggczF/VZbZ4H3kUDbMZ++s2n4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:references:x-google-sender-auth; b=jSxrrqZZQuyVsyu7TFhCGR4WEBjuStJady6SvfpUNVTb50bt+lwhnYJ8HpX14JgXhp0tFsINL1YcLrAdxMqEWIqzmjU+QCgYDuHDll2inMtzxoie7f+EtZLT0SSlG2tnfjZtRFsnw0mLtKaQYZJaUQ0LFvrbLKNA7sf8bZWDNuA= Received: by 10.151.105.13 with SMTP id h13mr185259ybm.180.1205935601607; Wed, 19 Mar 2008 07:06:41 -0700 (PDT) Received: by 10.150.229.9 with HTTP; Wed, 19 Mar 2008 07:06:41 -0700 (PDT) Message-ID: <30b660a20803190706n51171946w9106a879ad6f0858@mail.gmail.com> Date: Wed, 19 Mar 2008 07:06:41 -0700 From: "Mark Davis" To: "CLDR list" Subject: Re: fill-in fields Cc: "cldr-users@unicode.org" In-Reply-To: <30b660a20803181850t5d72b81cvb089eae4d9d97bde@mail.gmail.com> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_13809_13127977.1205935601595" References: <30b660a20803181850t5d72b81cvb089eae4d9d97bde@mail.gmail.com> X-Google-Sender-Auth: 75010e70655c3d90 X-archive-position: 418 
X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: mark.davis@icu-project.org Precedence: bulk X-list: cldr-users
------=_Part_13809_13127977.1205935601595 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Content-Disposition: inline

In response to a private message:

What may not have been clear from my message was that this would only be an aid for the translator, to give him/her some text to start with. There would be no automatic propagation; only if the translator clicked in the field would something show up.

I am a bit uneasy about #4 also, just because there is a tendency anyway to align too closely to English. On the other hand, the English is a better starting point in many cases than just a code (e.g. for languages or countries). But I wanted to get others' opinions.

Mark

On Tue, Mar 18, 2008 at 6:50 PM, Mark Davis wrote:
> During today's telecon, we discussed trying to do the following:
>
> Ideally, when you click into the entry field of a plural form (or on its radio button), the text of the singular would appear so that it can be edited.
>
> I was thinking that a generalization of the above might make data entry easier in general. Namely, when you click on an *empty* text entry field (or its radio button), we always try to populate it with a guess, which can then be edited further. That would save people repetitive cutting and pasting.
>
> Here is a suggestion for what that guess should be: go through the following list and take the first item that matches.
>
> 1. The "winning" value.
>    - Example: if "Donnerstag" has the most votes for 'thursday', then clicking on the empty field will fill in "Donnerstag".
> 2. The singular form.
>    - Example: if the value for 'hour' is "heure", then clicking on the entry field for 'hours' will insert "heure".
> 3. The parent's value (except where that parent is Root)
>    - Example: if I'm in [de_CH] and there are no proposals for 'thursday', then clicking on the empty field will fill in "Donnerstag" from [de].
> 4. Finally, if there is no other information, and it is the Latin script, then fill in the English value
>    - Example: "Afghanistan"
> 5. Otherwise don't fill in anything.
>
> So, would this make things enough easier as to be worth doing, in general? And if so, does the above proposal need tweaking?
>
> --
> Mark

--
Mark
------=_Part_13809_13127977.1205935601595 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 7bit Content-Disposition: inline
In response to a private message:

What may not have been clear from my message was that this would only be an aid for the translator, to give him/her some text to start with. There would be no automatic propagation; only if the translator clicked in the field would something show up.

I am a bit uneasy about #4 also, just because there is a tendency anyway to align too closely to English. On the other hand, the English is a better starting point in many cases than just a code (e.g. for languages or countries). But I wanted to get others' opinions.

Mark

On Tue, Mar 18, 2008 at 6:50 PM, Mark Davis <mark.davis@icu-project.org> wrote:
During today's telecon, we discussed trying to do the following:

Ideally, when you click into the entry field of a plural form (or on its radio button), the text of the singular would appear so that it can be edited.


I was thinking that a generalization of the above might make data entry easier in general. Namely, when you click on an empty text entry field (or its radio button), we always try to populate it with a guess, which can then be edited further. That would save people repetitive cutting and pasting.

Here is a suggestion for what that guess should be: go through the following list and take the first item that matches.
  1. The "winning" value.
    • Example: if "Donnerstag" has the most votes for 'thursday', then clicking on the empty field will fill in "Donnerstag".
  2. The singular form.
    • Example: if the value for 'hour' is "heure", then clicking on the entry field for 'hours' will insert "heure".
  3. The parent's value (except where that parent is Root)
    • Example: if I'm in [de_CH] and there are no proposals for 'thursday', then clicking on the empty field will fill in "Donnerstag" from [de].
  4. Finally, if there is no other information, and it is the Latin script, then fill in the English value
    • Example: "Afghanistan"
  5. Otherwise don't fill in anything.
So, would this make things enough easier as to be worth doing, in general? And if so, does the above proposal need tweaking?

--
Mark



--
Mark ------=_Part_13809_13127977.1205935601595-- From mark.edward.davis@gmail.com Wed Mar 19 15:10:11 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Wed, 19 Mar 2008 15:10:11 -0600 (CST) Received: from gv-out-0910.google.com (gv-out-0910.google.com [216.239.58.186]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2JLA9CH001573 for ; Wed, 19 Mar 2008 15:10:10 -0600 Received: by gv-out-0910.google.com with SMTP id l14so276323gvf.4 for ; Wed, 19 Mar 2008 14:10:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:references:x-google-sender-auth; bh=/4ii+jI/z0pD1pVwDNqmQP/ryXHyMaPyiugqvQOrtd4=; b=qdMMpuxoW7csH5i33W3pffmeo+ETvAMGem6000fr4MJ585dX9sQFk3a3EZSzSlkYg+LeJB7nEfzJveLIolEcxHcMIEnSOUo/XLK5FF6jaKywetRfxX/uXTvCWIqoH1VrU9Ziv8GOn2zRrw8jWvzXIkJ8GADGB/8G5tWAiNvHllw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:references:x-google-sender-auth; b=KZJ/pnKCBioR9H1qSz3Z0C7vffonPuMXbde4cW1jrm4P30dCtRflDRVD4yP9/FIpp9Fv2iWuluGM+M3B5ulTZvG2lgIlqxSqKxaFtmy970DxXVMvvJTCy3cJAEgBk0ji7ZomASypiOZ72qdyo5VHfuXUlJLAt4z6q7+F1MNT9E0= Received: by 10.150.91.20 with SMTP id o20mr456305ybb.24.1205961006422; Wed, 19 Mar 2008 14:10:06 -0700 (PDT) Received: by 10.150.229.9 with HTTP; Wed, 19 Mar 2008 14:10:06 -0700 (PDT) Message-ID: <30b660a20803191410q261536ddg54979c89bb33a27@mail.gmail.com> Date: Wed, 19 Mar 2008 14:10:06 -0700 From: "Mark Davis" To: "Chris Hansten" Subject: Re: fill-in fields Cc: "John Emmons" , "CLDR list" , "cldr-users@unicode.org" In-Reply-To: MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_15192_27678619.1205961006567" References: X-Google-Sender-Auth: a78264c38d3973bd X-archive-position: 419 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: 
cldr-users-bounce@unicode.org X-original-sender: mark.davis@icu-project.org Precedence: bulk X-list: cldr-users
------=_Part_15192_27678619.1205961006567 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Content-Disposition: inline

What would happen is that if I click into an *empty* entry field, I'd get the "default" entered as if I'd typed it in. We already automatically set the radio button if there is any type-in in that field, so if I hit Save, it would be entered. If I wanted to reject it, then just like now I'd hit n/a. Or, I can delete everything in the field, since we reject empty text fields.

Implementing that would be pretty simple; we could add a function in CLDRFile like getTextEntryFillIn(String path), and the Survey tool would call it if someone clicks in an *empty* text entry field.

Your suggestions are interesting, but I'm worried about this getting too complicated and/or taking up too much screen real estate (for example, to show the source). Do you think that we can tweak what I've described to be useful and yet not risky?

Mark

On Wed, Mar 19, 2008 at 12:46 PM, Chris Hansten wrote:
> I agree we want to make data entry as easy as possible.
>
> What may not have been clear from my message was that this would only be an aid for the translator, to give him/her some text to start with. There would be no automatic propagation; only if the translator clicked in the field would something show up.
>
> I am a bit uneasy about #4 also, just because there is a tendency anyway to align too closely to English. On the other hand, the English is a better starting point in many cases than just a code (e.g. for languages or countries). But I wanted to get others' opinions.
>
> For clarity on this, Mark - based on your comments above, *if* the translator clicks on the field, the idea is that a candidate shows up that they can edit/accept. My main question is what happens if they choose not to edit the value, because they are not ready to complete this item. Say they look at the candidate, decide more research is needed, and move on to something else. Do they have to delete the proposed value? What affirmative step could they mistakenly take to save the proposed value without finalizing it?
>
> I am extremely nervous about anything that makes it easier for bad data to be saved by the translators. It seems to me it would be quite easy for ST users to not realize this was a guess they are supposed to edit or review, and instead accept the value as something proposed by the system.
>
> To avoid this, I would rather this translator candidate was a separate field they would have to affirmatively copy/paste from, or hit a button to move to the edit field, or something.
>
> Also, if the guesses can be of different kinds (as in 1-5 below), it should probably be clear in the UI what the guess is, so the translator knows what it is they are looking at ("winning value", "singular form", "English value", etc.)
>
> cheers
> chris
>
> On Mar 19, 2008, at 7:25 AM, John Emmons wrote:
>
> I think we need to do whatever we can to make the data entry easier for people, and I think this is a good start. Items 1-3 seem to make good sense. Item 4 ("Finally, if there is no other information, and it is the Latin script, then fill in the English value") I think should be limited to fields where we think there is a reasonable chance that the vetter would accept the English value as it stands. Some good examples of this would be short metazone names such as PST, MST, CST, etc. or things like date/time formats. I would NOT be in favor of doing this for fields that are pure translation such as languages, scripts, and territories, since there is almost no chance that the English value in the entry field would be even close to the correct translated value, even in a language that uses Latin script. In those cases, I think a blank field is much better.
>
> Regards,
>
> John C. Emmons
> Globalization Architect
> IBM Software Group, Austin TX
> Ph. 512-838-8184/512-259-9051
> Internet: emmo@us.ibm.com
>
> "Mark Davis"
> Sent by: cldr-bounce@unicode.org
> 03/18/2008 08:50 PM
> To: "CLDR list"
> cc: "cldr-users@unicode.org"
> Subject: fill-in fields
>
> During today's telecon, we discussed trying to do the following:
>
> Ideally, when you click into the entry field of a plural form (or on its radio button), the text of the singular would appear so that it can be edited.
>
> I was thinking that a generalization of the above might make data entry easier in general. Namely, when you click on an *empty* text entry field (or its radio button), we always try to populate it with a guess, which can then be edited further. That would save people repetitive cutting and pasting.
>
> Here is a suggestion for what that guess should be: go through the following list and take the first item that matches.
> 1. The "winning" value.
>    - Example: if "Donnerstag" has the most votes for 'thursday', then clicking on the empty field will fill in "Donnerstag".
> 2. The singular form.
>    - Example: if the value for 'hour' is "heure", then clicking on the entry field for 'hours' will insert "heure".
> 3. The parent's value (except where that parent is Root)
>    - Example: if I'm in [de_CH] and there are no proposals for 'thursday', then clicking on the empty field will fill in "Donnerstag" from [de].
> 4. Finally, if there is no other information, and it is the Latin script, then fill in the English value
>    - Example: "Afghanistan"
> 5. Otherwise don't fill in anything.
> So, would this make things enough easier as to be worth doing, in general? And if so, does the above proposal need tweaking?
>
> --
> Mark

--
Mark
------=_Part_15192_27678619.1205961006567--

From naz@mira.net Wed Mar 19 17:20:46 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Wed, 19 Mar 2008 17:20:46 -0600 (CST) Received: from smtp-auth.no-ip.com (smtp-auth.no-ip.com [204.16.252.95]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2JNKjdw009709; Wed, 19 Mar 2008 17:20:45 -0600 X-No-IP: mrnaz.com@noip-smtp X-Report-Spam-To: abuse@no-ip.com Received: from [192.168.0.6] (ppp121-44-226-239.lns2.mel4.internode.on.net [121.44.226.239]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) (Authenticated sender: mrnaz.com@noip-smtp) by smtp-auth.no-ip.com (Postfix) with ESMTP id 15A2ABCB8; Wed, 19 Mar 2008 16:20:43 -0700 (PDT) Message-ID: <47E19FC9.20506@mira.net> Date: Thu, 20 Mar 2008 10:20:41 +1100 From: Naz Gassiep User-Agent: Thunderbird 2.0.0.12 (Windows/20080213) MIME-Version: 1.0 To: CLDR list CC: "cldr-users@unicode.org" Subject: Re: fill-in fields References: <30b660a20803181850t5d72b81cvb089eae4d9d97bde@mail.gmail.com> In-Reply-To: <30b660a20803181850t5d72b81cvb089eae4d9d97bde@mail.gmail.com> Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 7bit X-archive-position: 420 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: naz@mira.net Precedence: bulk X-list: cldr-users

Given the precise nature of the information required here and the desire that all data
entry items be thoroughly thought through before being entered, any auto-fill mechanism should, IMHO, be approached with great caution. The only one that I think would be a good idea is proposal 3; however, I would make it so that the "inherited" value propagates to the empty field automatically only if the parent locale uses the same script. Otherwise, for example, zh_Hant would inherit values from zh, which is obviously not right.
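
To make the same-script condition concrete, here is a minimal sketch. Everything in it (the PARENT, SCRIPT and VALUES tables and the inherited_guess function) is invented for illustration and is not part of CLDR or the Survey tool.

```python
from typing import Dict, Optional, Tuple

# Toy tables, assumed for this sketch only.
PARENT: Dict[str, str] = {"de_CH": "de", "zh_Hant": "zh"}
SCRIPT: Dict[str, str] = {"de": "Latn", "de_CH": "Latn", "zh": "Hans", "zh_Hant": "Hant"}
VALUES: Dict[Tuple[str, str], str] = {
    ("de", "thursday"): "Donnerstag",
    ("zh", "thursday"): "周四",
}


def inherited_guess(locale: str, field: str) -> Optional[str]:
    """Return the parent's value only when parent and child share a script."""
    parent = PARENT.get(locale)
    if parent is None:
        return None
    child_script = SCRIPT.get(locale)
    # Refuse to guess when the script is unknown or differs from the parent's.
    if child_script is None or SCRIPT.get(parent) != child_script:
        return None
    return VALUES.get((parent, field))
```

With these toy tables, de_CH picks up the value from de, while zh_Hant gets nothing from zh because the scripts differ.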

Proposal 4 is a terrible idea, as you'd get English values put into French, Italian, Spanish and every other Latin-script language. Remember also that any properly implemented app will have a fallback default anyway, and providing an English fallback would override that. IMHO, this is undesirable. The ultimate fallback should be up to the application developer. For example, in my application, if no translation for a country name is found, it defaults to the English name as it appears in ISO 3166-1, regardless of the selected locale. This behaviour may be different for an application developed in China or the Middle East.

I think what I'm trying to say is that I'm against almost any move to automate what needs to be an exactingly precise process. Auto-fill fields, to a certain extent, tend to take the brain out of the loop.

Regards,
- Naz

Mark Davis wrote:
During today's telecon, we discussed trying to do the following:

Ideally, when you click into the entry field of a plural form (or on its radio button), the text of the singular would appear so that it can be edited.


I was thinking that a generalization of the above might make data entry easier in general. Namely, when you click on an empty text entry field (or its radio button), we always try to populate it with a guess, which can then be edited further. That would save people repetitive cutting and pasting.

Here is a suggestion for what that guess should be: go through the following list and take the first item that matches.
  1. The "winning" value.
    • Example: if "Donnerstag" has the most votes for 'thursday', then clicking on the empty field will fill in "Donnerstag".
  2. The singular form.
    • Example: if the value for 'hour' is "heure", then clicking on the entry field for 'hours' will insert "heure".
  3. The parent's value (except where that parent is Root)
    • Example: if I'm in [de_CH] and there are no proposals for 'thursday', then clicking on the empty field will fill in "Donnerstag" from [de].
  4. Finally, if there is no other information, and it is the Latin script, then fill in the English value
    • Example: "Afghanistan"
  5. Otherwise don't fill in anything.
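
The first-match list above can be sketched as a chain of lookups tried in order. This is only an illustration of the proposal as written: the toy tables, the field keys such as "territory/AF", and every function name here are assumptions, not the Survey tool's code.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Source = Callable[[str, str], Optional[str]]  # (locale, field) -> guess or None

# Toy data, assumed for this sketch only.
WINNING: Dict[Tuple[str, str], str] = {
    ("de", "thursday"): "Donnerstag",
    ("fr", "hour"): "heure",
}
ENGLISH: Dict[str, str] = {"territory/AF": "Afghanistan"}
SINGULAR_OF: Dict[str, str] = {"hours": "hour"}
PARENT: Dict[str, str] = {"de_CH": "de", "de": "root", "fr": "root", "xx": "root"}
LATIN_LOCALES = {"de", "de_CH", "fr", "xx"}


def winning_value(locale: str, field: str) -> Optional[str]:
    return WINNING.get((locale, field))  # 1. the "winning" value


def singular_form(locale: str, field: str) -> Optional[str]:
    singular = SINGULAR_OF.get(field)  # 2. the singular form
    return WINNING.get((locale, singular)) if singular else None


def parent_value(locale: str, field: str) -> Optional[str]:
    parent = PARENT.get(locale)  # 3. the parent's value, except Root
    if parent is None or parent == "root":
        return None
    return WINNING.get((parent, field))


def english_if_latin(locale: str, field: str) -> Optional[str]:
    # 4. the English value, only for Latin-script locales
    return ENGLISH.get(field) if locale in LATIN_LOCALES else None


PROPOSED_ORDER: Sequence[Source] = (
    winning_value, singular_form, parent_value, english_if_latin,
)


def fill_in_guess(locale: str, field: str,
                  sources: Sequence[Source] = PROPOSED_ORDER) -> Optional[str]:
    """Return the first non-empty guess; None means leave the field empty (5)."""
    for source in sources:
        value = source(locale, field)
        if value:
            return value
    return None
```

Keeping the order as a plain sequence means a step can be dropped or reordered without touching the others.
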
So, would this make things easier enough to be worth doing in general? And if so, does the above proposal need tweaking?

--
Mark
From srl@icu-project.org Wed Mar 19 17:32:22 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Wed, 19 Mar 2008 17:32:22 -0600 (CST) Received: from k2smtpout02-01.prod.mesa1.secureserver.net (k2smtpout02-01.prod.mesa1.secureserver.net [64.202.189.90]) by unicode.org (8.12.11/8.12.11) with SMTP id m2JNWMXo014398 for ; Wed, 19 Mar 2008 17:32:22 -0600 Received: (qmail 4299 invoked from network); 19 Mar 2008 23:32:21 -0000 Received: from unknown (HELO ssl.icu-project.org) (208.109.248.225) by k2smtpout02-01.prod.mesa1.secureserver.net (64.202.189.90) with ESMTP; 19 Mar 2008 23:32:21 -0000 Received: from [129.42.184.35] (helo=dyn741779.sanjose.ibm.com) by ssl.icu-project.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.66) (envelope-from ) id 1Jc7jQ-00059R-Rq; Wed, 19 Mar 2008 16:30:04 -0700 Message-ID: <47E1A27C.2040301@icu-project.org> Date: Wed, 19 Mar 2008 16:32:12 -0700 From: "Steven R. Loomis" User-Agent: Thunderbird 2.0.0.12 (Macintosh/20080213) MIME-Version: 1.0 To: Chris Hansten CC: Mark Davis , John Emmons , CLDR list , "cldr-users@unicode.org" Subject: Re: fill-in fields References: <30b660a20803191410q261536ddg54979c89bb33a27@mail.gmail.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 421 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: srl@icu-project.org Precedence: bulk X-list: cldr-users It would probably be good for someone to start mocking these up. Since the clicking part is done with javascript, it could be posted somewhere as a static page, for testing -s Chris Hansten wrote: > Hmm...I'm still a little worried. If the value is populated when the > click there, and doing nothing means it remains, and there is no way > to visually distinguish it from something they entered themselves, I > can just see too many cases where someone might accidentally keep the > value. 
>
> Can the default value when populated be visually distinguished at the
> point it is populated? For example, could it be bright red, and
> subsequent edits they made to the field cause it to revert to the
> normal black? Then I might not be as concerned.
>
> cheers
> chris
> On Mar 19, 2008, at 2:10 PM, Mark Davis wrote:
>
>> uld happen is that if I click into an *empty* entry field, I'd get
>> the "default" entered as if I'd typed it in. We already automatically
>> set the radio button if there is any type-in in that field, so if I
>> hit Save, it would be entered. If I wanted to reject it, then just
>> like now I'd hit n/a. Or, I can delete everything in the field, since
>> we reject empty text fields.
>>
>> Implementing that would be pretty simple; we could add a function in
>> CLDRFile like getTextEntryFillIn(String path), and the Survey tool
>> would call it if someone clicks in an *empty* text entry field.
>>
>> Your suggestions are interesting, but I'm worried about this getting
>> too complicated and/or taking up too much screen realestate (like to
>> show the source). Do you think that we can tweak what I've described
>> to be useful and yet not risky?
>> >> Mark >> >> > From mark.edward.davis@gmail.com Sun Mar 23 21:46:49 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Sun, 23 Mar 2008 21:46:49 -0600 (CST) Received: from hs-out-0708.google.com (hs-out-0708.google.com [64.233.178.250]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2O3knhh009298 for ; Sun, 23 Mar 2008 21:46:49 -0600 Received: by hs-out-0708.google.com with SMTP id x43so2305478hsb.3 for ; Sun, 23 Mar 2008 20:46:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:references:x-google-sender-auth; bh=Fm1lb01kd4niZHNyuEtF2C4+FYejiyHQ7VEFzg8srhU=; b=kglZUurDvCPyxkT4dtRD8M7LbeOY0dY4TfOJkgJghukHl4fXT7mw2/yRZB82+VwI0lezqu2Seg+29tmS11B90Zt1R/Y+nKXSR8xaQn74XmAoHG/MzOtpek86zslDtjVlsoguUiNdMmzunzs9MQK8MooBKor33njLWhsuo4MxDnA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:references:x-google-sender-auth; b=FicmGF6fuUv6T/BbZEzNOWn7g42udmfnxIXPVTR7Q/STHVvtfd+DEzzzSruwgoo3r6NAnuShYib5dadb7V87QG7gdJmSrMFjrTliGv6C5fSSNESI2nsW1E9DbSGbKkTZAhNzRqcGkMEqvFhhQkXHpHD1Sf+P/wBEnxk+1/Mc1qw= Received: by 10.151.112.3 with SMTP id p3mr2843360ybm.192.1206330408822; Sun, 23 Mar 2008 20:46:48 -0700 (PDT) Received: by 10.150.229.9 with HTTP; Sun, 23 Mar 2008 20:46:48 -0700 (PDT) Message-ID: <30b660a20803232046j6b4dd181r5b86c1a19ba01a1e@mail.gmail.com> Date: Sun, 23 Mar 2008 20:46:48 -0700 From: "Mark Davis" To: "Steven R. 
Loomis" Subject: Re: fill-in fields Cc: "Chris Hansten" , "John Emmons" , "CLDR list" , "cldr-users@unicode.org" In-Reply-To: <47E1A27C.2040301@icu-project.org> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_7366_2414164.1206330408811" References: <30b660a20803191410q261536ddg54979c89bb33a27@mail.gmail.com> <47E1A27C.2040301@icu-project.org> X-Google-Sender-Auth: b3789229a9f8b943 X-archive-position: 422 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: mark.davis@icu-project.org Precedence: bulk X-list: cldr-users ------=_Part_7366_2414164.1206330408811 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Content-Disposition: inline I added getFillInValue(path) to CLDRFile, and to ConsoleCheck. It appears to work as expected. If you have a chance to try adding it, we could test it out. So what it does is: 1. Winning value (if not from root) 2. Plural fallback 3. Winning value (even if in root) 4. (I dropped #4 (the english value) as that seemed too risky.) Mark On Wed, Mar 19, 2008 at 4:32 PM, Steven R. Loomis wrote: > It would probably be good for someone to start mocking these up. > > Since the clicking part is done with javascript, it could be posted > somewhere as a static page, for testing > > -s > > > Chris Hansten wrote: > > Hmm...I'm still a little worried. If the value is populated when the > > click there, and doing nothing means it remains, and there is no way > > to visually distinguish it from something they entered themselves, I > > can just see too many cases where someone might accidentally keep the > > value. > > > > Can the default value when populated be visually distinguished at the > > point it is populated? For example, could it be bright red, and > > subsequent edits they made to the field cause it to revert to the > > normal black? Then I might not be as concerned. 
> >
> > cheers
> > chris
> > On Mar 19, 2008, at 2:10 PM, Mark Davis wrote:
> >
> >> uld happen is that if I click into an *empty* entry field, I'd get
> >> the "default" entered as if I'd typed it in. We already automatically
> >> set the radio button if there is any type-in in that field, so if I
> >> hit Save, it would be entered. If I wanted to reject it, then just
> >> like now I'd hit n/a. Or, I can delete everything in the field, since
> >> we reject empty text fields.
> >>
> >> Implementing that would be pretty simple; we could add a function in
> >> CLDRFile like getTextEntryFillIn(String path), and the Survey tool
> >> would call it if someone clicks in an *empty* text entry field.
> >>
> >> Your suggestions are interesting, but I'm worried about this getting
> >> too complicated and/or taking up too much screen realestate (like to
> >> show the source). Do you think that we can tweak what I've described
> >> to be useful and yet not risky?
> >>
> >> Mark
> >>
> >>
> >
>
>

--
Mark
------=_Part_7366_2414164.1206330408811
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

I added getFillInValue(path) to CLDRFile, and to ConsoleCheck. It appears to work as expected. If you have a chance to try adding it, we could test it out.

So what it does is:
  1. Winning value (if not from root)
  2. Plural fallback
  3. Winning value (even if in root)
  4. (I dropped #4 (the english value) as that seemed too risky.)
Mark
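
One possible reading of the revised order above, as a rough sketch: it is not the actual getFillInValue code in CLDRFile. The RESOLVED table, the field names, and the fill_in_value function are invented for illustration, and taking the plural fallback from the singular's value regardless of where it came from is an assumption.

```python
from typing import Dict, Optional, Tuple

# Toy resolved data: (locale, field) -> (winning value, locale it was found in).
# A source of "root" means the value is only inherited from root.
RESOLVED: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("fr", "hour"): ("heure", "fr"),
    ("fr", "hours"): ("h", "root"),
    ("de_CH", "thursday"): ("Donnerstag", "de"),
    ("xx", "thursday"): ("Thu", "root"),
}
SINGULAR_OF: Dict[str, str] = {"hours": "hour"}


def fill_in_value(locale: str, field: str) -> Optional[str]:
    resolved = RESOLVED.get((locale, field))
    # 1. Winning value, unless it only comes from root.
    if resolved and resolved[1] != "root":
        return resolved[0]
    # 2. Plural fallback: reuse the singular form's value (assumed behaviour).
    singular = SINGULAR_OF.get(field)
    if singular:
        singular_resolved = RESOLVED.get((locale, singular))
        if singular_resolved:
            return singular_resolved[0]
    # 3. Winning value, even if it comes from root.
    if resolved:
        return resolved[0]
    # Nothing to suggest: leave the field empty.
    return None
```

With this toy data, 'hours' in fr falls back to "heure" rather than the root value, and a locale with only a root value still gets that value as a last resort.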

On Wed, Mar 19, 2008 at 4:32 PM, Steven R. Loomis <srl@icu-project.org> wrote:
It would probably be good for someone to start mocking these up.

Since the clicking part is done with javascript, it could be posted
somewhere as a static page, for testing

-s


Chris Hansten wrote:
> Hmm...I'm still a little worried. If the value is populated when the
> click there, and doing nothing means it remains, and there is no way
> to visually distinguish it from something they entered themselves, I
> can just see too many cases where someone might accidentally keep the
> value.
>
> Can the default value when populated be visually distinguished at the
> point it is populated? For example, could it be bright red, and
> subsequent edits they made to the field cause  it to revert to the
> normal black? Then I might not be as concerned.
>
> cheers
> chris
> On Mar 19, 2008, at 2:10 PM, Mark Davis wrote:
>
>> uld happen is that if I click into an *empty* entry field, I'd get
>> the "default" entered as if I'd typed it in. We already automatically
>> set the radio button if there is any type-in in that field, so if I
>> hit Save, it would be entered. If I wanted to reject it, then just
>> like now I'd hit n/a. Or, I can delete everything in the field, since
>> we reject empty text fields.
>>
>> Implementing that would be pretty simple; we could add a function in
>> CLDRFile like getTextEntryFillIn(String path), and the Survey tool
>> would call it if someone clicks in an *empty* text entry field.
>>
>> Your suggestions are interesting, but I'm worried about this getting
>> too complicated and/or taking up too much screen realestate (like to
>> show the source). Do you think that we can tweak what I've described
>> to be useful and yet not risky?
>>
>> Mark
>>
>>
>




--
Mark ------=_Part_7366_2414164.1206330408811-- From v-magdad@microsoft.com Mon Mar 24 16:06:41 2008 Received: with ECARTIS (v1.0.0; list cldr-users); Mon, 24 Mar 2008 16:07:10 -0600 (CST) Received: from smtp.microsoft.com (mail1.microsoft.com [131.107.115.212]) by unicode.org (8.12.11/8.12.11) with ESMTP id m2OM6eVn019473; Mon, 24 Mar 2008 16:06:40 -0600 Received: from TK5-EXHUB-C102.redmond.corp.microsoft.com (157.54.18.53) by TK5-EXGWY-E801.partners.extranet.microsoft.com (10.251.56.50) with Microsoft SMTP Server (TLS) id 8.1.240.5; Mon, 24 Mar 2008 15:07:06 -0700 Received: from NA-EXMSG-C125.redmond.corp.microsoft.com ([157.54.61.83]) by TK5-EXHUB-C102.redmond.corp.microsoft.com ([157.54.18.53]) with mapi; Mon, 24 Mar 2008 15:06:33 -0700 From: "Magda Danish (Unicode)" To: "unicode@unicode.org" Date: Mon, 24 Mar 2008 15:06:42 -0700 Subject: Call for Participation: 32nd Internationalization & Unicode Conference -- San Jose, Calif., USA; September 8-10, 2008 Thread-Topic: Call for Participation: 32nd Internationalization & Unicode Conference -- San Jose, Calif., USA; September 8-10, 2008 Thread-Index: AciN+1eBXqRsL2pzTyCHj2BxG9u+NQ== Message-ID: <871A62EA91884849A3BE952CA63832D016BBE7C215@NA-EXMSG-C125.redmond.corp.microsoft.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: acceptlanguage: en-US Content-Type: multipart/alternative; boundary="_000_871A62EA91884849A3BE952CA63832D016BBE7C215NAEXMSGC125re_" MIME-Version: 1.0 X-archive-position: 423 X-ecartis-version: Ecartis v1.0.0 Sender: cldr-users-bounce@unicode.org Errors-to: cldr-users-bounce@unicode.org X-original-sender: v-magdad@microsoft.com Precedence: bulk X-list: cldr-users --_000_871A62EA91884849A3BE952CA63832D016BBE7C215NAEXMSGC125re_