From - Fri Aug 01 02:39:37 2003 X-UIDL: <066001c357c7$936abc20$deeefea9@Xerxes> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta05-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801005529.QJNK25208.mta05-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 01:55:29 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h710tJ216499; Thu, 31 Jul 2003 20:55:19 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Thu, 31 Jul 2003 20:55:19 -0400 (EDT) Received: from smtp03.mrf.mail.rcn.net (smtp03.mrf.mail.rcn.net [207.172.4.62]) by unicode.org (8.11.6/8.11.6) with ESMTP id h710tJ216493 for ; Thu, 31 Jul 2003 20:55:19 -0400 Received: from 216-164-48-205.c3-0.gth-ubr1.lnh-gth.md.cable.rcn.com ([216.164.48.205] helo=Xerxes) by smtp03.mrf.mail.rcn.net with smtp (Exim 3.35 #4) id 19iOCU-00005T-00; Thu, 31 Jul 2003 20:55:18 -0400 Message-ID: <066001c357c7$936abc20$deeefea9@Xerxes> From: "Ted Hopp" To: =?iso-8859-1?Q?Karlj=FCrgen_Feuerherm?= , References: <3F2939D7.9010703@smrtytrek.com> <043301c35786$ef2a0b10$deeefea9@Xerxes> <048301c357a0$f74f2950$deeefea9@Xerxes> <20030731211857.GJ9926@skunk.reutershealth.com> <058301c357b0$666d3700$deeefea9@Xerxes> <015401c357c0$02635ef0$24c4fed8@kgfeuerherm> Subject: [hebrew] Re: Hebrew Vav Holam Date: Thu, 31 Jul 2003 20:55:17 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1158 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 X-archive-position: 3 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: ted@newslate.com Precedence: bulk X-list: hebrew On Thursday, July 31, 2003 8:01 PM, Karljürgen Feuerherm wrote: > Ted, > > > Is not U+FB35 HEBREW 
LETTER VAV WITH DAGESH a shuruq? > > > > Only graphically. Different pronunciation, different names, different > > functions grammatically. Old typewriters used to have only a single key > for > > the lower case letter 'l' and the digit '1'. (Change your font if you > can't > > see the difference.) Sometimes, Unicode is an old typewriter. > > That's not a particularly good analogy, unless you can demonstrate that the > two are at times clearly visually different as well (as are lowercase 'l' > and '1'). If they are not, there's no particularly good reason to create a > new 'shureq'. At most, it might be an argument to create some dot alongside > dagesh, because shureq is based upon the semi-vowel waw just as holem-waw > is. But nobody is asking for a new shuruq. Just a new holam male, which at times is rendered identically to vav with holam, but also has a history of differences in representation (including different shapes to the dot itself, in some cases) dating from 1000 years ago to the present day. Ted Ted Hopp, Ph.D. ZigZag, Inc. 
ted@newSLATE.com +1-301-990-7453 newSLATE is your personal learning workspace ...on the web at http://www.newSLATE.com/ From - Fri Aug 01 02:39:38 2003 X-UIDL: <018a01c357d3$faff30d0$24c4fed8@kgfeuerherm> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta02-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801022425.ENKY9049.mta02-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 03:24:25 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h712OD221988; Thu, 31 Jul 2003 22:24:13 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Thu, 31 Jul 2003 22:24:13 -0400 (EDT) Received: from fep04-mail.bloor.is.net.cable.rogers.com (fep04-mail.bloor.is.net.cable.rogers.com [66.185.86.74]) by unicode.org (8.11.6/8.11.6) with ESMTP id h712OC221982 for ; Thu, 31 Jul 2003 22:24:12 -0400 Received: from kgfeuerherm ([216.254.195.80]) by fep04-mail.bloor.is.net.cable.rogers.com (InterMail vM.5.01.05.12 201-253-122-126-112-20020820) with ESMTP id <20030801022320.XAWX392870.fep04-mail.bloor.is.net.cable.rogers.com@kgfeuerherm>; Thu, 31 Jul 2003 22:23:20 -0400 Message-ID: <018a01c357d3$faff30d0$24c4fed8@kgfeuerherm> From: =?iso-8859-1?Q?Karlj=FCrgen_Feuerherm?= To: "Ted Hopp" , References: <3F2939D7.9010703@smrtytrek.com> <043301c35786$ef2a0b10$deeefea9@Xerxes> <048301c357a0$f74f2950$deeefea9@Xerxes> <20030731211857.GJ9926@skunk.reutershealth.com> <058301c357b0$666d3700$deeefea9@Xerxes> <015401c357c0$02635ef0$24c4fed8@kgfeuerherm> <066001c357c7$936abc20$deeefea9@Xerxes> Subject: [hebrew] Re: Hebrew Vav Holam Date: Thu, 31 Jul 2003 22:24:04 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1158 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 X-Authentication-Info: 
Submitted using SMTP AUTH LOGIN at fep04-mail.bloor.is.net.cable.rogers.com from [216.254.195.80] using ID at Thu, 31 Jul 2003 22:23:17 -0400 X-archive-position: 4 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: cuneiform@rogers.com Precedence: bulk X-list: hebrew All right then, my mistake. I am finding it hard to keep track of all the views and today in particular I was not able to absorb all the messages. K ----- Original Message ----- From: "Ted Hopp" To: "Karljürgen Feuerherm" ; Sent: Thursday, July 31, 2003 8:55 PM Subject: Re: Hebrew Vav Holam > On Thursday, July 31, 2003 8:01 PM, Karljürgen Feuerherm wrote: > > Ted, > > > > Is not U+FB35 HEBREW LETTER VAV WITH DAGESH a shuruq? > > > > > > Only graphically. Different pronunciation, different names, different > > > functions grammatically. Old typewriters used to have only a single key > > for > > > the lower case letter 'l' and the digit '1'. (Change your font if you > > can't > > > see the difference.) Sometimes, Unicode is an old typewriter. > > > > That's not a particularly good analogy, unless you can demonstrate that > the > > two are at times clearly visually different as well (as are lowercase 'l' > > and '1'). If they are not, there's no particularly good reason to create a > > new 'shureq'. At most, it might be an argument to create some dot > alongside > > dagesh, because shureq is based upon the semi-vowel waw just as holem-waw > > is. > > But nobody is asking for a new shuruq. Just a new holam male, which at times > is rendered identically to vav with holam, but also has a history of > differences in representation (including different shapes to the dot itself, > in some cases) dating from 1000 years ago to the present day. > > Ted > > Ted Hopp, Ph.D. > ZigZag, Inc. 
> ted@newSLATE.com > +1-301-990-7453 > > newSLATE is your personal learning workspace > ...on the web at http://www.newSLATE.com/ > > > From - Fri Aug 01 02:39:39 2003 X-UIDL: <20030801043903.GH451@mercury.ccil.org> X-Mozilla-Status: 0003 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta07-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801043915.UDJM1286.mta07-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 05:39:15 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h714d3225181; Fri, 1 Aug 2003 00:39:03 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 00:39:03 -0400 (EDT) Received: from mercury.ccil.org (mercury.ccil.org [192.190.237.100]) by unicode.org (8.11.6/8.11.6) with ESMTP id h714d3225175 for ; Fri, 1 Aug 2003 00:39:03 -0400 Received: from cowan by mercury.ccil.org with local (Exim 3.35 #1 (Debian)) id 19iRh1-00012Y-00; Fri, 01 Aug 2003 00:39:03 -0400 Date: Fri, 1 Aug 2003 00:39:03 -0400 To: Ted Hopp Cc: hebrew@unicode.org Subject: [hebrew] Re: Hebrew Vav Holam Message-ID: <20030801043903.GH451@mercury.ccil.org> References: <001701c35792$043a7ed0$0401c80a@QSM4> <046501c3579b$6a1eb2d0$deeefea9@Xerxes> <20030731200437.GF9926@skunk.reutershealth.com> <04d001c357a5$ab52b350$deeefea9@Xerxes> <20030731210618.GI9926@skunk.reutershealth.com> <056c01c357af$3adde5e0$deeefea9@Xerxes> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <056c01c357af$3adde5e0$deeefea9@Xerxes> User-Agent: Mutt/1.3.28i From: John Cowan X-archive-position: 5 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: cowan@mercury.ccil.org Precedence: bulk X-list: hebrew Ted Hopp scripsit: > True, but what about editing? Should a backspace delete both characters or > just one? 
That's a general problem for lots of different Unicode forms. If you are
writing Hausa, say, where some letters require combining marks but others
have precomposed forms, we similarly have to be clever about backspacing.

> What is this with a right holam on an alef? There is no such thing.

How would you typeset an English sentence like "When with is followed by , Hebrew typography uses a with no vowel sign followed by an ."?
All the Hebrew letters appear "unnaturally" in isolation here, and
contextual determination will not suffice.

> A holam male character wouldn't be decomposable if there weren't a "right
> holam" combining mark.

If there were a right holam combining mark, there would not be a holam
male character.

--
MEET US AT POINT ORANGE AT MIDNIGHT BRING YOUR DUCK OR PREPARE TO FACE WUGGUMS
John Cowan http://www.reutershealth.com jcowan@reutershealth.com

From - Fri Aug 01 02:39:39 2003
X-UIDL: <07c201c357e9$92566d80$deeefea9@Xerxes>
X-Mozilla-Status: 0003
X-Mozilla-Status2: 00000000
Return-Path:
Received: from unicode.org ([209.235.17.55]) by mta03-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801045852.XDRQ14590.mta03-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 05:58:52 +0100
Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h714wf225727; Fri, 1 Aug 2003 00:58:41 -0400
Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 00:58:41 -0400 (EDT)
Received: from smtp03.mrf.mail.rcn.net (smtp03.mrf.mail.rcn.net [207.172.4.62]) by unicode.org (8.11.6/8.11.6) with ESMTP id h714we225721 for ; Fri, 1 Aug 2003 00:58:40 -0400
Received: from 216-164-48-205.c3-0.gth-ubr1.lnh-gth.md.cable.rcn.com ([216.164.48.205] helo=Xerxes) by smtp03.mrf.mail.rcn.net with smtp (Exim 3.35 #4) id 19iRzz-0003Kd-00; Fri, 01 Aug 2003 00:58:39 -0400
Message-ID: <07c201c357e9$92566d80$deeefea9@Xerxes>
From: "Ted Hopp"
To: "John Cowan"
Cc:
References:
<001701c35792$043a7ed0$0401c80a@QSM4> <046501c3579b$6a1eb2d0$deeefea9@Xerxes> <20030731200437.GF9926@skunk.reutershealth.com> <04d001c357a5$ab52b350$deeefea9@Xerxes> <20030731210618.GI9926@skunk.reutershealth.com> <056c01c357af$3adde5e0$deeefea9@Xerxes> <20030801043903.GH451@mercury.ccil.org> Subject: [hebrew] Re: Hebrew Vav Holam Date: Fri, 1 Aug 2003 00:58:38 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1158 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 X-archive-position: 6 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: ted@newslate.com Precedence: bulk X-list: hebrew On Friday, August 01, 2003 12:39 AM, John Cowan wrote: > How would you typeset an English sentence like "When with > is followed by , Hebrew typography uses a with no vowel > sign followed by an ."? My goodness, why would I want to? That sentence is completely wrong. :) Ted Ted Hopp, Ph.D. ZigZag, Inc. 
ted@newSLATE.com +1-301-990-7453 newSLATE is your personal learning workspace ...on the web at http://www.newSLATE.com/ From - Fri Aug 01 02:39:40 2003 X-UIDL: <07ca01c357eb$594522a0$deeefea9@Xerxes> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta07-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801051129.VLVP1286.mta07-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 06:11:29 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h715BO226781; Fri, 1 Aug 2003 01:11:24 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 01:11:24 -0400 (EDT) Received: from smtp03.mrf.mail.rcn.net (smtp03.mrf.mail.rcn.net [207.172.4.62]) by unicode.org (8.11.6/8.11.6) with ESMTP id h715BO226775 for ; Fri, 1 Aug 2003 01:11:24 -0400 Received: from 216-164-48-205.c3-0.gth-ubr1.lnh-gth.md.cable.rcn.com ([216.164.48.205] helo=Xerxes) by smtp03.mrf.mail.rcn.net with smtp (Exim 3.35 #4) id 19iSCJ-0004WP-00; Fri, 01 Aug 2003 01:11:23 -0400 Message-ID: <07ca01c357eb$594522a0$deeefea9@Xerxes> From: "Ted Hopp" To: "John Cowan" Cc: References: <001701c35792$043a7ed0$0401c80a@QSM4> <046501c3579b$6a1eb2d0$deeefea9@Xerxes> <20030731200437.GF9926@skunk.reutershealth.com> <04d001c357a5$ab52b350$deeefea9@Xerxes> <20030731210618.GI9926@skunk.reutershealth.com> <056c01c357af$3adde5e0$deeefea9@Xerxes> <20030801043903.GH451@mercury.ccil.org> Subject: [hebrew] Re: Hebrew Vav Holam Date: Fri, 1 Aug 2003 01:11:22 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1158 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 X-archive-position: 7 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: 
ted@newslate.com Precedence: bulk X-list: hebrew On Friday, August 01, 2003 12:39 AM, John Cowan wrote: > Ted Hopp scripsit: > > A holam male character wouldn't be decomposable if there weren't a "right > > holam" combining mark. > > If there were a right holam combining mark, there would not be a holam > male character. But there already is, at least in Hebrew. Just code it up, instead of inventing characters that no Hebrew grammarian ever heard of. Ted Ted Hopp, Ph.D. ZigZag, Inc. ted@newSLATE.com +1-301-990-7453 newSLATE is your personal learning workspace ...on the web at http://www.newSLATE.com/ From - Fri Aug 01 03:09:20 2003 X-UIDL: <3F2A3C3E.3060209@ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta02-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801100910.DODK9049.mta02-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 11:09:10 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71A94214471; Fri, 1 Aug 2003 06:09:04 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 06:09:04 -0400 (EDT) Received: from smtp-out7.blueyonder.co.uk (smtp-out7.blueyonder.co.uk [195.188.213.10]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71A94214465 for ; Fri, 1 Aug 2003 06:09:04 -0400 Received: from ntlworld.com ([80.195.249.86]) by smtp-out7.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.5600); Fri, 1 Aug 2003 11:09:02 +0100 Message-ID: <3F2A3C3E.3060209@ntlworld.com> Date: Fri, 01 Aug 2003 03:09:02 -0700 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: John Cowan CC: Ted Hopp , hebrew@unicode.org Subject: [hebrew] Re: Hebrew Vav Holam References: <001701c35792$043a7ed0$0401c80a@QSM4> <046501c3579b$6a1eb2d0$deeefea9@Xerxes> 
<20030731200437.GF9926@skunk.reutershealth.com> <04d001c357a5$ab52b350$deeefea9@Xerxes> <20030731210618.GI9926@skunk.reutershealth.com> <056c01c357af$3adde5e0$deeefea9@Xerxes> <20030801043903.GH451@mercury.ccil.org> In-Reply-To: <20030801043903.GH451@mercury.ccil.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 01 Aug 2003 10:09:02.0805 (UTC) FILETIME=[EEF3C850:01C35814] X-archive-position: 8 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peter.r.kirk@ntlworld.com Precedence: bulk X-list: hebrew On 31/07/2003 21:39, John Cowan wrote: >Ted Hopp scripsit: > > > >>A holam male character wouldn't be decomposable if there weren't a "right >>holam" combining mark. >> >> > >If there were a right holam combining mark, there would not be a holam >male character. > > > In Latin script, as there is a dotless i and a combining upper dot, your principle seems to have abolished the letter i. Ted's point is surely that Hebrew users think of holam male as an entity, whereas they do not think of alef with right dot as an entity. This seems illogical to me because the underlying principles of holam positioning are identical, just as we would think it illogical if the dot on an i was considered separate but the dot on a j was not considered separate. But we have to allow for what users think is logical. The problem is that different users have different ideas, and so backspacing algorithms are going to be a nightmare. 
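
[Editorial sketch, not part of the original message.] The backspacing disagreement above can be made concrete with two candidate policies: delete one code point, or delete a base character together with the marks attached to it. The function names below are hypothetical, and the "cluster" policy is a deliberate simplification that only looks at Unicode general categories; it does not model the extra question raised in the thread of whether a holam belongs with the preceding consonant or with a following vav.

```python
import unicodedata


def backspace_code_point(text: str) -> str:
    """Policy 1: remove exactly one code point from the end."""
    return text[:-1]


def backspace_cluster(text: str) -> str:
    """Policy 2: remove the last base character plus any trailing marks.

    A character counts as a mark here when its general category starts
    with "M" (Mn, Mc, Me); this is an assumption made for illustration.
    """
    end = len(text)
    # Walk back over trailing combining marks (points, dagesh, accents).
    while end > 0 and unicodedata.category(text[end - 1]).startswith("M"):
        end -= 1
    # Then drop the base character the marks were attached to, if any.
    return text[:max(end - 1, 0)]


# bet, vav, holam (U+05D1 U+05D5 U+05B9)
word = "\u05D1\u05D5\u05B9"
after_one = backspace_code_point(word)   # leaves bet + vav
after_cluster = backspace_cluster(word)  # leaves bet only
```

Neither policy is presented as the right answer; which one users expect is exactly what is in dispute in this thread.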
-- Peter Kirk peter.r.kirk@ntlworld.com http://web.onetel.net.uk/~peterkirk/ From - Fri Aug 01 03:19:19 2003 X-UIDL: <3F2A3C9D.4050704@ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta02-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801101048.DRBC9049.mta02-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 11:10:48 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71AAd214527; Fri, 1 Aug 2003 06:10:39 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 06:10:39 -0400 (EDT) Received: from smtp-out3.blueyonder.co.uk (smtp-out3.blueyonder.co.uk [195.188.213.6]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71AAd214521 for ; Fri, 1 Aug 2003 06:10:39 -0400 Received: from ntlworld.com ([80.195.249.86]) by smtp-out3.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.5600); Fri, 1 Aug 2003 11:10:38 +0100 Message-ID: <3F2A3C9D.4050704@ntlworld.com> Date: Fri, 01 Aug 2003 03:10:37 -0700 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: Ted Hopp CC: John Cowan , hebrew@unicode.org Subject: [hebrew] Re: Hebrew Vav Holam References: <001701c35792$043a7ed0$0401c80a@QSM4> <046501c3579b$6a1eb2d0$deeefea9@Xerxes> <20030731200437.GF9926@skunk.reutershealth.com> <04d001c357a5$ab52b350$deeefea9@Xerxes> <20030731210618.GI9926@skunk.reutershealth.com> <056c01c357af$3adde5e0$deeefea9@Xerxes> <20030801043903.GH451@mercury.ccil.org> <07c201c357e9$92566d80$deeefea9@Xerxes> In-Reply-To: <07c201c357e9$92566d80$deeefea9@Xerxes> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 01 Aug 2003 10:10:38.0689 (UTC) FILETIME=[281A8510:01C35815] X-archive-position: 9 X-ecartis-version: Ecartis 
v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peter.r.kirk@ntlworld.com Precedence: bulk X-list: hebrew On 31/07/2003 21:58, Ted Hopp wrote: >On Friday, August 01, 2003 12:39 AM, John Cowan wrote: > > >>How would you typeset an English sentence like "When with >>is followed by , Hebrew typography uses a with no vowel >>sign followed by an ."? >> >> > >My goodness, why would I want to? That sentence is completely wrong. :) > > > > Indeed, but I don't think it's one of the principles of Unicode to force people to tell the truth! :-) -- Peter Kirk peter.r.kirk@ntlworld.com http://web.onetel.net.uk/~peterkirk/ From - Fri Aug 01 05:29:22 2003 X-UIDL: <1030801132301.29576@mss6-svc.service.ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta01-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801122301.JXEF25721.mta01-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 13:23:01 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71CMm229538; Fri, 1 Aug 2003 08:22:48 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 08:22:48 -0400 (EDT) Received: from pheidippides.md.chalmers.se (pheidippides.md.chalmers.se [129.16.237.91]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71CMl229532 for ; Fri, 1 Aug 2003 08:22:48 -0400 Received: from chalmers95a69n (elek-4-213.chl.chalmers.se [129.16.214.213]) by pheidippides.md.chalmers.se (8.10.1/8.10.1) with ESMTP id h71CMhB14791; Fri, 1 Aug 2003 14:22:43 +0200 (MET DST) Reply-To: From: "Kent Karlsson" To: "'Peter Kirk'" Cc: Subject: [hebrew] Re: From [b-hebrew] Variant forms of vav with holem Date: Fri, 1 Aug 2003 14:16:32 +0200 Organization: Data och IT, Chalmers University of Technology and Göteborg University Message-ID: <001301c35826$beb1f100$d5d61081@chalmers95a69n> MIME-Version: 
1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook, Build 10.0.2627
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4807.1700
In-Reply-To: <3F29AB95.2010505@ntlworld.com>
Importance: Normal
X-archive-position: 10
X-ecartis-version: Ecartis v1.0.0
Sender: hebrew-bounce@unicode.org
Errors-to: hebrew-bounce@unicode.org
X-original-sender: kentk@cs.chalmers.se
Precedence: bulk
X-list: hebrew

Peter Kirk quoted and wrote:

> >Which my proposal trivially achieves:
> >vav-holam:
> >some consonant followed by holam male: <[consonant], holam, ZWJ, vav>
>
> But then you don't need the ZWJ as vav, holam is already distinct from
> holam, vav. Or does it serve some purpose which I haven't grasped?

In the ZWJ approach, <[consonant], holam, vav> would put the holam on the
preceding consonant. The ZWJ would say "please form a ligature between
<[consonant], holam> and ; the conventional "ligature" would be to move
the holam a bit over to the vav.

In the ZWNJ approach, that "ligature" would be the default, and ZWNJ would
be used to ask not to form such a "ligature".

> And then if we want a word initial holam male e.g. for a list of glyphs
> or just possibly for some other language, I suppose we would have to
> encode it something like ZWSP - holam - ZWJ - vav, or use WJ
> (2060 word joiner) instead of ZWSP. Or do you have a better suggestion
> for this?

(in the ZWJ approach); the space would also serve to separate the words
(the holam could occur just after a control code, like newline, or at the
beginning of a string, and then should act as if applied to a space). But
from your earlier message I got the impression that holam-vav only
occurred after another consonant, and that the vowel mark really
(logically) belonged to that preceding consonant.

How would you sort holam-vav compared to vav-holam?
A problem in distinguishing those two is that holam is ignored at level 1 (according to the Unicode default ordering). But if vowel marks had been significant at level 1, then what? /kent k From - Fri Aug 01 06:44:07 2003 X-UIDL: <3F2A6D8C.3040400@ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta07-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801133941.BPQC1286.mta07-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 14:39:41 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71DdQ221773; Fri, 1 Aug 2003 09:39:27 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 09:39:26 -0400 (EDT) Received: from smtp-out4.blueyonder.co.uk (smtp-out4.blueyonder.co.uk [195.188.213.7]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71DdQ221767 for ; Fri, 1 Aug 2003 09:39:26 -0400 Received: from ntlworld.com ([80.195.249.86]) by smtp-out4.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.5600); Fri, 1 Aug 2003 14:39:24 +0100 Message-ID: <3F2A6D8C.3040400@ntlworld.com> Date: Fri, 01 Aug 2003 06:39:24 -0700 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: kentk@cs.chalmers.se CC: hebrew@unicode.org Subject: [hebrew] Re: From [b-hebrew] Variant forms of vav with holem References: <001301c35826$beb1f100$d5d61081@chalmers95a69n> In-Reply-To: <001301c35826$beb1f100$d5d61081@chalmers95a69n> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 01 Aug 2003 13:39:24.0705 (UTC) FILETIME=[5230E110:01C35832] X-archive-position: 11 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peter.r.kirk@ntlworld.com Precedence: 
bulk
X-list: hebrew

On 01/08/2003 05:16, Kent Karlsson wrote:

>Peter Kirk quoted and wrote:
>
>>>Which my proposal trivially achieves:
>>>vav-holam:
>>>some consonant followed by holam male: <[consonant], holam, ZWJ, vav>
>>
>>But then you don't need the ZWJ as vav, holam is already distinct from
>>holam, vav. Or does it serve some purpose which I haven't grasped?
>
>In the ZWJ approach, <[consonant], holam, vav> would put the holam
>on the preceding consonant. The ZWJ would say "please form a ligature
>between <[consonant], holam> and ; the conventional "ligature"
>would be to move the holam a bit over to the vav.
>
>In the ZWNJ approach, that "ligature" would be the default, and ZWNJ
>would be used to ask not to form such a "ligature".

But in that case how do I encode a text if I want to make it dependent on
the font whether the holam shifts or not? Well, I'm not really sure that
we need to do that as the regular rules for holam shift on to vav seem to
be mostly fixed. So we could encode all holam males as holam, ZWJ, vav as
you originally suggested. My previous objections were probably too strong.
But we would have to realise that holam followed by vav and no other
vowel, without ZWJ, is a spelling error, or an anomalous case, as the
holam would not be shifted although the rule is that it should be in such
a case. In any case, this is no worse than the suggestion of using a
special right holam, which has the same objection re spelling errors, and
at least doing it this way avoids adding a new character.

So the suggestion now is as follows, where C is any base character, /
indicates alternatives, accents, dagesh etc. may also be inserted, and
rules apply in order, i.e. stop after first match:

C holam alef vav dagesh/holam - holam not shifted
C holam alef vowel-point - holam not shifted
C holam alef - holam shifted on to right of alef
C holam C - holam not shifted
C holam ZWNJ C - holam not shifted (even if second C is alef, for anomalous cases only)
C holam ZWJ C - holam shifted on to right of second C (if second C is vav, this is holam male, other cases are anomalous)

By comparison, what I suggested before is:

C holam vav/alef vav dagesh/holam - holam not shifted
C holam vav/alef vowel-point - holam not shifted
C holam vav/alef - holam shifted on to right of vav/alef (with vav, this is holam male)
C holam C - holam not shifted
C holam ZWNJ C - holam not shifted (even if second C is vav or alef, for anomalous cases only)
C holam ZWJ C - holam shifted on to right of second C (for anomalous cases only)

These rules are only marginally more complex, as the first three lines are
anyway needed for alef, and the advantage of this set is that thousands of
ZWJ's don't have to be added to existing texts.

Of course we could simplify further if we add thousands more ZWJ's into
holam alef sequences, so the rules are simply:

C holam C - holam not shifted
C holam ZWJ C - holam shifted on to right of second C

but this has the disadvantage that it breaks what everyone already agrees
as the correct encoding of alef with holam.

>>And then if we want a word initial holam male e.g. for a list of glyphs
>>or just possibly for some other language, I suppose we would have to
>>encode it something like ZWSP - holam - ZWJ - vav, or use WJ
>>(2060 word joiner) instead of ZWSP. Or do you have a better suggestion
>>for this?
>
>(in the ZWJ approach); the space would also
>serve to separate the words (the holam could occur just after a control
>code, like newline, or at the beginning of a string, and then should act
>as if applied to a space).
>But from your earlier message I got the impression that
>holam-vav only occurred after another consonant, and that the vowel mark
>really (logically) belonged to that preceding consonant.

Yes, in connected text. But isolated holam male can occur in dictionaries
etc. Your use of space doesn't work at the very beginning of a document or
string, but I suppose a zero width space can be used in that rather
special case. And a combining character at the beginning of a string isn't
actually illegal, just anomalous, so presumably can be used for anomalous
cases like this one.

>How would you sort holam-vav compared to vav-holam? A problem
>in distinguishing those two is that holam is ignored at level 1
>(according to the Unicode default ordering). But if vowel marks had been
>significant at level 1, then what?

We have enough real issues to deal with without raising hypothetical ones!
Is there actually a real issue here? I'm not sure. Would X - holam - vav -
Y in fact sort differently from X - vav - holam - Y, for any strings X and
Y, given that holam and vav are at different levels of significance?
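
[Editorial sketch, not part of the original message.] The ordered rule list given earlier in this message as "the suggestion now" (the one that uses ZWJ for holam male) can be read as a first-match decision procedure. The sketch below is one possible reading under stated assumptions: the function and constant names are hypothetical, "vowel-point" is assumed to mean U+05B0 to U+05B9 plus U+05BB, "base character" is approximated as any letter, and the allowance that accents and dagesh may be interleaved is ignored.

```python
import unicodedata

HOLAM = "\u05B9"   # HEBREW POINT HOLAM
DAGESH = "\u05BC"  # HEBREW POINT DAGESH OR MAPIQ
ALEF = "\u05D0"    # HEBREW LETTER ALEF
VAV = "\u05D5"     # HEBREW LETTER VAV
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"     # ZERO WIDTH JOINER

# Assumed reading of "vowel-point": sheva through holam, plus qubuts.
VOWEL_POINTS = frozenset(chr(cp) for cp in range(0x05B0, 0x05BA)) | {"\u05BB"}


def is_base(ch: str) -> bool:
    """Stand-in for "C is any base character": a single letter."""
    return len(ch) == 1 and unicodedata.category(ch).startswith("L")


def holam_shifts(following: str) -> bool:
    """Given the text that follows "C holam", say whether the holam shifts.

    Rules are tried in the order listed in the message; the first match wins.
    Returns False when no rule matches.
    """
    first, second, third = following[0:1], following[1:2], following[2:3]
    if first == ALEF:
        if second == VAV and third in (DAGESH, HOLAM):
            return False  # C holam alef vav dagesh/holam: not shifted
        if second in VOWEL_POINTS:
            return False  # C holam alef vowel-point: not shifted
        return True       # C holam alef: shifted on to right of alef
    if is_base(first):
        return False      # C holam C: not shifted
    if first == ZWNJ and is_base(second):
        return False      # C holam ZWNJ C: not shifted
    if first == ZWJ and is_base(second):
        return True       # C holam ZWJ C: shifted on to second C
    return False
```

Under this reading, `holam_shifts(ZWJ + VAV)` reports a shift (holam male), while `holam_shifts(VAV)` does not, which matches the remark above that holam followed by vav without ZWJ would have to be treated as a spelling error or an anomalous case.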
-- Peter Kirk peter.r.kirk@ntlworld.com http://web.onetel.net.uk/~peterkirk/ From - Fri Aug 01 07:49:21 2003 X-UIDL: <20030801144016.GN9926@skunk.reutershealth.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta02-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801144239.VXEW9049.mta02-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 15:42:39 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71EgJ204437; Fri, 1 Aug 2003 10:42:19 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 10:42:19 -0400 (EDT) Received: from mail.reutershealth.com ([65.246.141.36]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71EgJ204431 for ; Fri, 1 Aug 2003 10:42:19 -0400 Received: from skunk.reutershealth.com (mail [65.246.141.36]) by mail.reutershealth.com (Pro-8.9.3/Pro-8.9.3) with SMTP id KAA04819 for ; Fri, 1 Aug 2003 10:38:03 -0400 (EDT) Received: by skunk.reutershealth.com (sSMTP sendmail emulation); Fri, 1 Aug 2003 10:40:17 -0400 Date: Fri, 1 Aug 2003 10:40:17 -0400 From: John Cowan To: hebrew@unicode.org Subject: [hebrew] Holam male and line break Message-ID: <20030801144016.GN9926@skunk.reutershealth.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-archive-position: 12 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: jcowan@reutershealth.com Precedence: bulk X-list: hebrew What happens when the holam-male stands at the beginning of a line? Anything special? -- All Norstrilians knew what laughter was: John Cowan it was "pleasurable corrigible malfunction". 
http://www.reutershealth.com --Cordwainer Smith, _Norstrilia_ jcowan@reutershealth.com From - Fri Aug 01 08:09:21 2003 X-UIDL: <000701c35846$30b291f0$0401c80a@QSM4> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta05-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801150354.KDDF25208.mta05-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 16:03:54 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71F3a208473; Fri, 1 Aug 2003 11:03:36 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 11:03:36 -0400 (EDT) Received: from mx-out.daemonmail.net (mx-out.daemonmail.net [216.104.160.39]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71F3Y208467 for ; Fri, 1 Aug 2003 11:03:35 -0400 Received: from mx0.emailqueue.net (localhost.daemonmail.net [127.0.0.1]) by mx-out.daemonmail.net (8.9.3p2/8.9.3) with SMTP id IAA74653 for ; Fri, 1 Aug 2003 08:03:33 -0700 (PDT) (envelope-from rosennej@qsm.co.il) Received: from (212.235.83.152 [212.235.83.152]) by mail.qsm.co.il with ESMTP id LPJ0rTP2 Fri, 01 Aug 2003 08:03:31 -0700 (PDT) From: "Jony Rosenne" To: Subject: [hebrew] Re: Holam male and line break Date: Fri, 1 Aug 2003 18:01:36 +0200 Message-ID: <000701c35846$30b291f0$0401c80a@QSM4> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.4510 In-Reply-To: <20030801144016.GN9926@skunk.reutershealth.com> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 Importance: Normal X-archive-position: 13 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: rosennej@qsm.co.il Precedence: bulk X-list: hebrew It cannot happen in the beginning of a word. It always follows another letter. 
Jony > -----Original Message----- > From: hebrew-bounce@unicode.org > [mailto:hebrew-bounce@unicode.org] On Behalf Of John Cowan > Sent: Friday, August 01, 2003 4:40 PM > To: hebrew@unicode.org > Subject: [hebrew] Holam male and line break > > > What happens when the holam-male stands at the beginning of a > line? Anything special? > > -- > All Norstrilians knew what laughter was: John Cowan > it was "pleasurable corrigible malfunction". > http://www.reutershealth.com > --Cordwainer Smith, > _Norstrilia_ jcowan@reutershealth.com > > > From - Fri Aug 01 08:19:21 2003 X-UIDL: <20030801150911.GU9926@skunk.reutershealth.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta05-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801151122.KOYL25208.mta05-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 16:11:22 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71FBAj08858; Fri, 1 Aug 2003 11:11:10 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 11:11:10 -0400 (EDT) Received: from mail.reutershealth.com ([65.246.141.36]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71FBAj08852 for ; Fri, 1 Aug 2003 11:11:10 -0400 Received: from skunk.reutershealth.com (mail [65.246.141.36]) by mail.reutershealth.com (Pro-8.9.3/Pro-8.9.3) with SMTP id LAA05095; Fri, 1 Aug 2003 11:06:58 -0400 (EDT) Received: by skunk.reutershealth.com (sSMTP sendmail emulation); Fri, 1 Aug 2003 11:09:11 -0400 Date: Fri, 1 Aug 2003 11:09:11 -0400 From: John Cowan To: Jony Rosenne Cc: hebrew@unicode.org Subject: [hebrew] Re: Holam male and line break Message-ID: <20030801150911.GU9926@skunk.reutershealth.com> References: <20030801144016.GN9926@skunk.reutershealth.com> <000701c35846$30b291f0$0401c80a@QSM4> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<000701c35846$30b291f0$0401c80a@QSM4> User-Agent: Mutt/1.4.1i X-archive-position: 14 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: jcowan@reutershealth.com Precedence: bulk X-list: hebrew Jony Rosenne scripsit: > It cannot happen in the beginning of a word. It always follows another > letter. So Hebrew words are never split across a line with a hyphen? Forgive me asking these elementary questions. -- John Cowan www.ccil.org/~cowan www.reutershealth.com jcowan@reutershealth.com "'My young friend, if you do not now, immediately and instantly, pull as hard as ever you can, it is my opinion that your acquaintance in the large-pattern leather ulster' (and by this he meant the Crocodile) 'will jerk you into yonder limpid stream before you can say Jack Robinson.'" --the Bi-Coloured-Python-Rock-Snake From - Fri Aug 01 08:29:22 2003 X-UIDL: <000801c35848$753c1fb0$0401c80a@QSM4> X-Mozilla-Status: 0003 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta05-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801152032.LDHP25208.mta05-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 16:20:32 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71FJvj09053; Fri, 1 Aug 2003 11:19:57 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 11:19:57 -0400 (EDT) Received: from mx-out.daemonmail.net (mx-out.daemonmail.net [216.104.160.39]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71FJuj09047 for ; Fri, 1 Aug 2003 11:19:57 -0400 Received: from mx0.emailqueue.net (localhost.daemonmail.net [127.0.0.1]) by mx-out.daemonmail.net (8.9.3p2/8.9.3) with SMTP id IAA87536 for ; Fri, 1 Aug 2003 08:19:48 -0700 (PDT) (envelope-from rosennej@qsm.co.il) Received: from (212.235.83.152 [212.235.83.152]) by mail.qsm.co.il with ESMTP id JlM0ppS2 Fri, 01 Aug 2003 
08:19:45 -0700 (PDT) From: "Jony Rosenne" To: Subject: [hebrew] Re: Holam male and line break Date: Fri, 1 Aug 2003 18:17:50 +0200 Message-ID: <000801c35848$753c1fb0$0401c80a@QSM4> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.4510 In-Reply-To: <20030801150911.GU9926@skunk.reutershealth.com> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 Importance: Normal X-archive-position: 15 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: rosennej@qsm.co.il Precedence: bulk X-list: hebrew Normally not, except in newspapers, but even if it were to be done not in the middle of a syllable. I don't recall ever seeing pointed Hebrew hyphenated. Jony > -----Original Message----- > From: John Cowan [mailto:jcowan@reutershealth.com] > Sent: Friday, August 01, 2003 5:09 PM > To: Jony Rosenne > Cc: hebrew@unicode.org > Subject: Re: [hebrew] Re: Holam male and line break > > > Jony Rosenne scripsit: > > It cannot happen in the beginning of a word. It always > follows another > > letter. > > So Hebrew words are never split across a line with a hyphen? > > Forgive me asking these elementary questions. 
> > -- > John Cowan www.ccil.org/~cowan www.reutershealth.com > jcowan@reutershealth.com "'My young friend, if you do not > now, immediately and instantly, pull as hard as ever you can, > it is my opinion that your acquaintance in the large-pattern > leather ulster' (and by this he meant the Crocodile) 'will > jerk you into yonder limpid stream before you can say Jack Robinson.'" > --the Bi-Coloured-Python-Rock-Snake > From - Fri Aug 01 08:39:22 2003 X-UIDL: <083e01c35842$4bd79010$deeefea9@Xerxes> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta03-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801153358.LHPO14590.mta03-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 16:33:58 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71FXqj09561; Fri, 1 Aug 2003 11:33:52 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 11:33:52 -0400 (EDT) Received: from smtp03.mrf.mail.rcn.net (smtp03.mrf.mail.rcn.net [207.172.4.62]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71FXqj09555 for ; Fri, 1 Aug 2003 11:33:52 -0400 Received: from 216-164-48-205.c3-0.gth-ubr1.lnh-gth.md.cable.rcn.com ([216.164.48.205] helo=Xerxes) by smtp03.mrf.mail.rcn.net with smtp (Exim 3.35 #4) id 19ibud-0006tr-00; Fri, 01 Aug 2003 11:33:47 -0400 Message-ID: <083e01c35842$4bd79010$deeefea9@Xerxes> From: "Ted Hopp" To: "Jony Rosenne" , References: <000701c35846$30b291f0$0401c80a@QSM4> Subject: [hebrew] Re: Holam male and line break Date: Fri, 1 Aug 2003 11:33:45 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1158 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 X-archive-position: 18 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org 
Errors-to: hebrew-bounce@unicode.org X-original-sender: ted@newslate.com Precedence: bulk X-list: hebrew On Friday, August 01, 2003 12:01 PM, Jony Rosenne wrote: > It cannot happen in the beginning of a word. It always follows another > letter. Not in the table of Hebrew vowels found at the start of my Hebrew/English dictionaries and Hebrew language textbooks. Ted Ted Hopp, Ph.D. ZigZag, Inc. ted@newSLATE.com +1-301-990-7453 newSLATE is your personal learning workspace ...on the web at http://www.newSLATE.com/ From - Fri Aug 01 09:59:21 2003 X-UIDL: <3F2A99F1.6050703@ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta02-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801165028.ELAV9049.mta02-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 17:50:28 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71Gnld04028; Fri, 1 Aug 2003 12:49:52 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 12:49:46 -0400 (EDT) Received: from smtp-out6.blueyonder.co.uk (smtp-out6.blueyonder.co.uk [195.188.213.9]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71GnLd04011 for ; Fri, 1 Aug 2003 12:49:46 -0400 Received: from ntlworld.com ([80.195.249.86]) by smtp-out6.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.5600); Fri, 1 Aug 2003 17:48:49 +0100 Message-ID: <3F2A99F1.6050703@ntlworld.com> Date: Fri, 01 Aug 2003 09:48:49 -0700 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: hebrew@unicode.org Subject: [hebrew] Re: Holam male and line break References: <000801c35848$753c1fb0$0401c80a@QSM4> In-Reply-To: <000801c35848$753c1fb0$0401c80a@QSM4> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit 
X-OriginalArrivalTime: 01 Aug 2003 16:48:49.0820 (UTC) FILETIME=[C853CDC0:01C3584C] X-archive-position: 19 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peter.r.kirk@ntlworld.com Precedence: bulk X-list: hebrew On 01/08/2003 09:17, Jony Rosenne wrote: >Normally not, except in newspapers, but even if it were to be done not in >the middle of a syllable. > >I don't recall ever seeing pointed Hebrew hyphenated. > >Jony > > > Of course this excludes the use of maqaf. But maqaf is like a hard hyphen, always visible and a line break opportunity after. And at least for spelling rules maqaf counts as a word break, so holam male does not occur after it. As for line initial holam male in e.g. dictionaries, I suppose it could be encoded word joiner - holam - male, as word joiner has taken the functions of zero width no break space. -- Peter Kirk peter.r.kirk@ntlworld.com http://web.onetel.net.uk/~peterkirk/ From - Fri Aug 01 08:29:21 2003 X-UIDL: <1030801162031.29621@mss6-svc.service.ntlworld.com> X-Mozilla-Status: 0003 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta05-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801152030.LDGD25208.mta05-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 16:20:30 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71FJxj09060; Fri, 1 Aug 2003 11:19:59 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 11:19:59 -0400 (EDT) Received: from pheidippides.md.chalmers.se (pheidippides.md.chalmers.se [129.16.237.91]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71FJwj09054 for ; Fri, 1 Aug 2003 11:19:58 -0400 Received: from chalmers95a69n (elek-4-213.chl.chalmers.se [129.16.214.213]) by pheidippides.md.chalmers.se (8.10.1/8.10.1) with ESMTP id h71FJgB24098; Fri, 1 Aug 2003 17:19:42 +0200 (MET 
DST) Reply-To: From: "Kent Karlsson" To: "'Peter Kirk'" Cc: Subject: [hebrew] Re: From [b-hebrew] Variant forms of vav with holem Date: Fri, 1 Aug 2003 17:13:30 +0200 Organization: Data och IT, Chalmers University of Technology and Göteborg University Message-ID: <001701c3583f$77a6f440$d5d61081@chalmers95a69n> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.2627 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4807.1700 In-Reply-To: <3F2A6D8C.3040400@ntlworld.com> Importance: Normal X-archive-position: 16 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: kentk@cs.chalmers.se Precedence: bulk X-list: hebrew > >In the ZWJ approach, <[consonant], holam, vav> would put the holam > >on the preceding consonant. The ZWJ would say "please form a ligature > >between <[consonant], holam> and ; the conventional "ligature" > >would be to move the holam a bit over to the vav. > > > >In the ZWNJ approach, that "ligature" would be the default, and ZWNJ > >would be used to ask not to form such a "ligature". > > > > > But in that case how do I encode a text if I want to make it > dependent on the font whether the holam shifts or not? Well, the way ZWJ (should) work, is that it asks for a "ligature", **if there is one** in the font for those characters. If the font in question does not have a ligature for the characters asked to be "ligated", the text will be rendered without that "ligature". In the ZWNJ approach, <[consonant], holam, vav> may or may not put holam on the vav (font dependent), but <[consonant], holam, ZWNJ, vav> should prevent the "ligature" in any case. ... 
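The candidate encodings discussed above can be written out as code point sequences. This is a sketch for illustration only: bet is an arbitrary choice for the consonant, the dictionary labels are shorthand for the two approaches, and whether a given font actually shifts the holam is something the code cannot show.

```python
import unicodedata

BET = "\u05D1"    # arbitrary consonant chosen for the example
HOLAM = "\u05B9"  # HEBREW POINT HOLAM
VAV = "\u05D5"    # HEBREW LETTER VAV
ZWJ = "\u200D"    # ZERO WIDTH JOINER
ZWNJ = "\u200C"   # ZERO WIDTH NON-JOINER

sequences = {
    "plain": BET + HOLAM + VAV,
    "with ZWJ (ask for the holam to shift)": BET + HOLAM + ZWJ + VAV,
    "with ZWNJ (ask for the holam not to shift)": BET + HOLAM + ZWNJ + VAV,
}

def describe(text):
    """List each code point of text as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

for label, text in sequences.items():
    print(label)
    for line in describe(text):
        print("   ", line)
```

Listing the code points this way makes it easy to check which of the three sequences a stored text actually contains, since the joiner characters are invisible when rendered.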
> So the suggestion now is as follows, where C is any base character, /
> indicates alternatives, accents, dagesh etc may also be
[complex rules not quoted]

I'm not sure I follow all these cases, but they do appear overly complex.

> These rules are only marginally more complex, as the first three lines
> are anyway needed for alef, and the advantage of this set is that
> thousands of ZWJ's don't have to be added to existing texts. Of course
> we could simplify further if we add thousands more ZWJ's into holam alef
> sequences, so the rules are simply:
>
> C holam C - holam not shifted
> C holam ZWJ C - holam shifted on to right of second C
>
> but this has the disadvantage that it breaks what everyone already
> agrees as the correct encoding of alef with holam.

I was thinking of MUCH simpler rules, like:

ZWNJ approach:
<C, holam, vav>         right-holam on the vav in most Hebrew fonts (by "ligature")
<C, holam, ZWNJ, vav>   holam on the C in all Hebrew fonts
<C, holam, alef>        right-holam on the alef in most Hebrew fonts (by "ligature")
<C, holam, ZWNJ, alef>  holam on the C in all Hebrew fonts

ZWJ approach:
<C, holam, ZWJ, vav>    right-holam on the vav in most Hebrew fonts (by "ligature")
<C, holam, vav>         holam on the C in most Hebrew fonts (use ZWNJ to be sure)
<C, holam, ZWJ, alef>   right-holam on the alef in most Hebrew fonts (by "ligature")
<C, holam, alef>        holam on the C in most Hebrew fonts (use ZWNJ to be sure)

These would apply regardless of whether or not there are any vowels on the vav/alef. (There could be some additional accent mark applied to the C.) While one can "make sure" that there is no "ligature", it is in all cases font dependent whether there is a ligature, but would be governed by convention rather than (Unicode) rules.

> >How would you sort holam-vav compared to vav-holam? A problem
> >in distinguishing those two is that holam is ignored at level 1
> >(according to the Unicode default ordering). But if vowel marks had been
> >significant at level 1, then what?
>
> We have enough real issues to deal with without raising hypothetical
> ones!
Well, I was just trying to establish more firmly what the logical place for the holam was in either case...

> Is there actually a real issue here? I'm not sure. Would X -
> holam - vav - Y in fact sort differently from X - vav - holam - Y, for
> any strings X and Y, given that holam and vav are at
> different levels of significance?

Yes. Without tailoring, the default Unicode collation order is:

X - vav - holam - Y
X - holam - vav - Y

pseudo-key: X' - vav - Y'; X'' - BASE - HOLAM - Y''; X''' - MIN - MIN - Y'''
pseudo-key: X' - vav - Y'; X'' - HOLAM - BASE - Y''; X''' - MIN - MIN - Y'''

and HOLAM is greater than BASE.

/kent k

From - Fri Aug 01 08:39:21 2003 X-UIDL: <20030801153143.GB15977@skunk.reutershealth.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta03-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801153356.LHNE14590.mta03-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 16:33:56 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71FXgj09551; Fri, 1 Aug 2003 11:33:42 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 11:33:42 -0400 (EDT) Received: from mail.reutershealth.com ([65.246.141.36]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71FXgj09545 for ; Fri, 1 Aug 2003 11:33:42 -0400 Received: from skunk.reutershealth.com (mail [65.246.141.36]) by mail.reutershealth.com (Pro-8.9.3/Pro-8.9.3) with SMTP id LAA05317; Fri, 1 Aug 2003 11:29:27 -0400 (EDT) Received: by skunk.reutershealth.com (sSMTP sendmail emulation); Fri, 1 Aug 2003 11:31:43 -0400 Date: Fri, 1 Aug 2003 11:31:43 -0400 From: John Cowan To: Kent Karlsson Cc: hebrew@unicode.org Subject: [hebrew] Re: From [b-hebrew] Variant forms of vav with holem Message-ID: <20030801153143.GB15977@skunk.reutershealth.com> References: <20030801143935.GM9926@skunk.reutershealth.com> 
<001601c3583e$d8de9bb0$d5d61081@chalmers95a69n> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <001601c3583e$d8de9bb0$d5d61081@chalmers95a69n> User-Agent: Mutt/1.4.1i X-archive-position: 17 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: jcowan@reutershealth.com Precedence: bulk X-list: hebrew Kent Karlsson scripsit: > > The right-shifted > > holam dot shows that the vav is not playing its usual role of consonant, > > but represents a vowel here (a "mater lectionis" in the jargon). > > Isn't it just a silent letter (the holam itself representing the vowel)? That is a plausible reading of the situation, if you are used to think of the vowel signs as simply *the* way to write vowels. Historically, though, vowels were originally written by recycling the consonant letters alef, yod, and vav, which led to ambiguities: when did the letter represent the consonants [?], [j], [w] ~ [v], and when the vowels [a], [i], [u]? The appropriate vowel mark was placed on the letter (and, critically, omitted from the previous letter) to show that the letter was functioning as a vowel. > Peter Kirk has indicated that the holam really belongs to the preceding > consonant, Phonologically, yes. > and that rendering it on the preceding consonant is > perfectly legitimate (though uncommon). Or did I misunderstand him? I'm not sure about this point. -- They do not preach John Cowan that their God will rouse them jcowan@reutershealth.com A little before the nuts work loose. http://www.ccil.org/~cowan They do not teach http://www.reutershealth.com that His Pity allows them --Rudyard Kipling, to drop their job when they damn-well choose. 
"The Sons of Martha" From - Fri Aug 01 10:13:28 2003 X-UIDL: <3F2A9F87.5030601@ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta07-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801171250.SAOL1286.mta07-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 18:12:50 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71HCfd09945; Fri, 1 Aug 2003 13:12:41 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 13:12:41 -0400 (EDT) Received: from smtp-out3.blueyonder.co.uk (smtp-out3.blueyonder.co.uk [195.188.213.6]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71HCed09939 for ; Fri, 1 Aug 2003 13:12:40 -0400 Received: from ntlworld.com ([80.195.249.86]) by smtp-out3.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.5600); Fri, 1 Aug 2003 18:12:39 +0100 Message-ID: <3F2A9F87.5030601@ntlworld.com> Date: Fri, 01 Aug 2003 10:12:39 -0700 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: kentk@cs.chalmers.se CC: hebrew@unicode.org Subject: [hebrew] Re: From [b-hebrew] Variant forms of vav with holem References: <001701c3583f$77a6f440$d5d61081@chalmers95a69n> In-Reply-To: <001701c3583f$77a6f440$d5d61081@chalmers95a69n> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 01 Aug 2003 17:12:39.0812 (UTC) FILETIME=[1CAB4440:01C35850] X-archive-position: 20 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peter.r.kirk@ntlworld.com Precedence: bulk X-list: hebrew On 01/08/2003 08:13, Kent Karlsson wrote: > > >>>In the ZWJ approach, <[consonant], holam, vav> would put the holam >>>on the preceding consonant. 
The ZWJ would say "please form a ligature
>>>between <[consonant], holam> and <vav>; the conventional "ligature"
>>>would be to move the holam a bit over to the vav.
>>>
>>>In the ZWNJ approach, that "ligature" would be the default, and ZWNJ
>>>would be used to ask not to form such a "ligature".
>>>
>>But in that case how do I encode a text if I want to make it
>>dependent on the font whether the holam shifts or not?
>>
>Well, the way ZWJ (should) work, is that it asks for a "ligature", **if
>there is one** in the font for those characters. If the font in question
>does not have a ligature for the characters asked to be "ligated", the
>text will be rendered without that "ligature".
>
>In the ZWNJ approach, <[consonant], holam, vav> may or may not
>put holam on the vav (font dependent), but <[consonant], holam, ZWNJ,
>vav> should prevent the "ligature" in any case.
>

OK, this should do what is necessary.

>I was thinking of MUCH simpler rules, like:
>
>ZWNJ approach:
><C, holam, vav>         right-holam on the vav in most Hebrew fonts (by "ligature")
><C, holam, ZWNJ, vav>   holam on the C in all Hebrew fonts
><C, holam, alef>        right-holam on the alef in most Hebrew fonts (by "ligature")
><C, holam, ZWNJ, alef>  holam on the C in all Hebrew fonts
>
>ZWJ approach:
><C, holam, ZWJ, vav>    right-holam on the vav in most Hebrew fonts (by "ligature")
><C, holam, vav>         holam on the C in most Hebrew fonts (use ZWNJ to be sure)
><C, holam, ZWJ, alef>   right-holam on the alef in most Hebrew fonts (by "ligature")
><C, holam, alef>        holam on the C in most Hebrew fonts (use ZWNJ to be sure)
>

Your rules are basically my simplified ones plus a ZWJ version. And the objection is the same: this would require recoding of all texts including holam alef as well as holam vav combinations, although there is currently nothing broken with the holam alef combinations which already follow my more complex rules. If it's not broken, don't fix it.

And the rules for holam vav, as currently implemented in Ezra SIL and SBL Hebrew, also work, although I only know of one text encoded with them.
So again no good reason to fix what isn't broken. And there is no good reason to simplify rules which are already clear and already implemented. > >Yes. Without tailoring, the default Unicode collation order is: > >X - vav - holam - Y >X - holam - vav - Y > >pseudo-key: X' - vav - Y'; X'' - BASE - HOLAM - Y''; X''' - MIN - MIN - >Y''' >pseudo-key: X' - vav - Y'; X'' - HOLAM - BASE - Y''; X''' - MIN - MIN - >Y''' > >and HOLAM is greater than BASE. > > Would it be possible to tailor the rules so that the two collated as identical? Sorry, but I don't understand Unicode collation yet. -- Peter Kirk peter.r.kirk@ntlworld.com http://web.onetel.net.uk/~peterkirk/ From - Fri Aug 01 11:46:54 2003 X-UIDL: <08be01c3585c$b84c75c0$deeefea9@Xerxes> X-Mozilla-Status: 0003 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta02-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801184313.LKQK9049.mta02-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 19:43:13 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71Igvd17640; Fri, 1 Aug 2003 14:42:57 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 14:42:57 -0400 (EDT) Received: from smtp03.mrf.mail.rcn.net (smtp03.mrf.mail.rcn.net [207.172.4.62]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71Igud17634 for ; Fri, 1 Aug 2003 14:42:56 -0400 Received: from 216-164-48-205.c3-0.gth-ubr1.lnh-gth.md.cable.rcn.com ([216.164.48.205] helo=Xerxes) by smtp03.mrf.mail.rcn.net with smtp (Exim 3.35 #4) id 19ierg-0007Cq-00 for hebrew@unicode.org; Fri, 01 Aug 2003 14:42:56 -0400 Message-ID: <08be01c3585c$b84c75c0$deeefea9@Xerxes> From: "Ted Hopp" To: Subject: [hebrew] Where holam male can occur Date: Fri, 1 Aug 2003 14:42:54 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: 
Microsoft Outlook Express 6.00.2800.1158 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 X-archive-position: 21 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: ted@newslate.com Precedence: bulk X-list: hebrew

I acknowledge that 99+% of the uses of Hebrew characters is in regular Hebrew (and related languages) text. However, Unicode cannot afford to ignore the small but important remaining uses.

Regarding holam male in particular, I want to put forth some principles to guide further discussion.

=====================

PRINCIPLE 0. A mechanism is needed to distinguish in Unicode-encoded text between holam male and vav with holam haser. This is called the "holam problem".

I hope that this need has been sufficiently established. The principle is NOT that there have to be distinct encodings for the two alternatives (although I believe that will be unavoidable), just that there is a need to distinguish.

PRINCIPLE 1. Any solution to the holam problem must be consistent with all regular Hebrew usage.

PRINCIPLE 2. Holam male can be the first character of a line or follow any arbitrary character.

I hope everyone accepts the dictionary example regarding the start of a line. Here are a few other examples:

A. Holam male can follow a space (e.g., in a comma-separated list of Hebrew vowels).

B. A maqaf is sometimes used to separate letters of a word for purposes of emphasis (e.g., to draw attention to the spelling). Thus, a holam male can follow a maqaf (WITH a word and line break opportunity after the maqaf).

C. I'm looking at a college-level beginning Hebrew text for English speakers that has fill-in-the-blank spelling exercises. Several of them have a holam male immediately following a wide, underlined blank space. If this were done in Unicode, the blank might be represented by OBJECT REPLACEMENT CHARACTER, followed by holam male.

D. 
In one vowel pronunciation table, the holam male follows something that looks like 00D7 (MULTIPLICATION SIGN).

E. Holam male can follow a HEBREW PUNCTUATION GERESH in spelling of foreign words (e.g., George => gimel-geresh-holam male (even in unpointed text!)-resh-gimel-geresh).

PRINCIPLE 3. Holam exists in Hebrew only as holam haser or holam male. These two cases are mutually exclusive and exhaustive. In particular, there is no "holam alef." This notion came into being as part of this discussion and does not exist in Hebrew grammar.

I don't have examples of something that doesn't exist. :) This notion of an alef that has a holam on the right is, I believe, entirely a misinterpretation of a holam haser placed on a preceding consonant and typeset in a way that it appears over the alef. As several examples have shown, a holam haser can go quite far over the next letter, even if the next letter bears a vowel mark of its own. This is most common with a lamed-holam haser sequence, because the lamed ascender "pushes" the holam dot further to the left. (Note that other languages have combining characters whose effects on rendering can extend far from where the combining character logically falls in the text stream. The same effect is at work here.)

PRINCIPLE 4. Any solution to the holam problem must be consistent with the preceding principles.

This applies, in particular, to solutions that involve non-standard character ordering (putting a combining character logically before its associated base character) or the use of layout control characters. Solutions need to address the distinction between holam male and vav with holam haser; no other uses of holam need be accommodated.

PRINCIPLE 5. Any solution must be assessed in terms of the full spectrum of Unicode use: data entry, encoding, rendering, searching, data exchange, editing, collation, etc.

PRINCIPLE 6. Ideally, solutions should work transparently with all legacy data. 
This may not be possible, as legacy data are already encoded with conflicting conventions. =========================== These may or may not be a set of principles to which everyone can agree. But until we develop such a set, we will continue to talk at cross-purposes. Ted Ted Hopp, Ph.D. ZigZag, Inc. ted@newSLATE.com +1-301-990-7453 newSLATE is your personal learning workspace ...on the web at http://www.newSLATE.com/ From - Fri Aug 01 11:46:55 2003 X-UIDL: <1030801194532.29576@mss6-svc.service.ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta02-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801184531.LOUG9049.mta02-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 19:45:31 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71Ij4d17686; Fri, 1 Aug 2003 14:45:05 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 14:45:04 -0400 (EDT) Received: from pheidippides.md.chalmers.se (pheidippides.md.chalmers.se [129.16.237.91]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71Ij4d17680 for ; Fri, 1 Aug 2003 14:45:04 -0400 Received: from chalmers95a69n (dynamic-193-148.dialup.chalmers.se [129.16.193.148]) by pheidippides.md.chalmers.se (8.10.1/8.10.1) with ESMTP id h71Ij0B06730; Fri, 1 Aug 2003 20:45:01 +0200 (MET DST) Reply-To: From: "Kent Karlsson" To: "'Peter Kirk'" Cc: Subject: [hebrew] Re: From [b-hebrew] Variant forms of vav with holem Date: Fri, 1 Aug 2003 20:38:47 +0200 Organization: Data och IT, Chalmers University of Technology and Göteborg University Message-ID: <000201c3585c$25baa3d0$94c11081@chalmers95a69n> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.2627 In-Reply-To: <3F2A9F87.5030601@ntlworld.com> X-MimeOLE: 
Produced By Microsoft MimeOLE V5.50.4807.1700 Importance: Normal X-archive-position: 22 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: kentk@cs.chalmers.se Precedence: bulk X-list: hebrew > >Yes. Without tailoring, the default Unicode collation order is: > > > >X - vav - holam - Y > >X - holam - vav - Y ... > Would it be possible to tailor the rules so that the two collated as > identical? Yes, but why would you want to do that? /kent k From - Fri Aug 01 12:29:22 2003 X-UIDL: <3F2ABD04.1000303@ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta03-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801192029.XUWS14590.mta03-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 20:20:29 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71JJhd23112; Fri, 1 Aug 2003 15:19:48 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 15:19:43 -0400 (EDT) Received: from smtp-out8.blueyonder.co.uk (smtp-out8.blueyonder.co.uk [195.188.213.11]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71JJ2d23096 for ; Fri, 1 Aug 2003 15:19:27 -0400 Received: from ntlworld.com ([80.195.249.86]) by smtp-out8.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.5600); Fri, 1 Aug 2003 20:18:29 +0100 Message-ID: <3F2ABD04.1000303@ntlworld.com> Date: Fri, 01 Aug 2003 12:18:28 -0700 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: Ted Hopp CC: hebrew@unicode.org Subject: [hebrew] Re: Where holam male can occur References: <08be01c3585c$b84c75c0$deeefea9@Xerxes> In-Reply-To: <08be01c3585c$b84c75c0$deeefea9@Xerxes> Content-Type: text/plain; charset=ISO-8859-1; format=flowed 
Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 01 Aug 2003 19:18:29.0729 (UTC) FILETIME=[B0C52D10:01C35861] X-archive-position: 23 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peter.r.kirk@ntlworld.com Precedence: bulk X-list: hebrew On 01/08/2003 11:42, Ted Hopp wrote: >I acknowledge that 99+% of the uses of Hebrew characters is in regular >Hebrew (and related languages) text. However, Unicode cannot afford to >ignore the small but important remaining uses. > >Regarding holam male in particular, I want to put forth some principles to >guide further discussion. > >===================== > >PRINCIPLE 0. A mechanism is needed to distinguish in Unicode-encoded text >between holam male and vav with holam haser. This is called the "holam >problem". > > Agreed. >I hope that this need has been sufficiently established. The principle is >NOT that there have to be distinct encodings for the two alternatives >(although I believe that will be unavoidable), ... > It is almost unavoidable. In principle it is generally possible to disambiguate these two contextually, but this disambiguation (unlike the easier disambiguation between holam male and holam haser FOLLOWED by vav) is difficult, probably too difficult for a rendering engine, and there are some ambiguous cases. >... just that there is a need to >distinguish. > >PRINCIPLE 1. Any solution to the holam problem must be consistent with all >regular Hebrew usage. > > Agreed, but I'm not sure what this means in practice. >PRINCIPLE 2. Holam male can be the first character of a line or follow any >arbitrary character. > > Yes, and followed by any arbitrary character. It may even carry a vowel point as in the divine name as shown in my document and in one of my postings. >I hope everyone accepts the dictionary example regarding the start of a >line. Here are a few other examples: > >A. 
Holam male can follow a space (e.g., in a comma-separated list of Hebrew >vowels). > >B. A maqaf is sometimes used to separate letters of a word for purposes of >emphasis (e.g., to draw attention to the spelling). Thus, a holam male can >follow a maqaf (WITH a word and line break opportunity after the maqaf). > >C. I'm looking at a college-level beginning Hebrew text for English speakers >that has fill-in-the-blank spelling exercises. Several of them have a holam >male immediately following a wide, underlined blank space. If this were done >in Unicode, the blank might be represented by OBJECT REPLACEMENT CHARACTER, >followed by holam male. > >D. In one vowel pronunciation table, the holam male follows something that >looks like 00D7 (MULTIPLICATION SIGN). > >E. Holam male can follow a HEBREW PUNCTUATION GERESH in spelling of foreign >words (e.g., George => gimel-geresh-holam male (even in unpointed >text!)-resh-gimel-geresh). > > If holam male is to be encoded as a sequence of holam and vav or of holam, ZWJ and vav, it is necessary to define an encoding for all of these uses. Perhaps ZWNJ holam vav or ZWNJ holam ZWJ vav would make it clear that the holam should not be attached to what precedes. But I'm not sure if these are recommended sequences. >PRINCIPLE 3. Holam exists in Hebrew only as holam haser or holam male. These >two cases are mutually exclusive and exhaustive. > >In particular, there is no "holam alef." This notion came into being as part >of this discussion and does not exist in Hebrew grammar. > > When I have written of "holam alef", I have meant either the sequence of characters holam followed by alef, or the case which does exist in which a holam dot is situated above the right side of an alef. I did not intend to imply that this is a logical unit. It is just like when Jony wrote of th or gh in English: these are not units but sequences, and so is what I called "holam alef". >I don't have examples of something that doesn't exist. :) > > But I do! 
;-) See the first illustration in my document, right word. This contains alef with holam above its top right. The position of the holam may be arbitrary, but it is correct.

>This notion of an alef that has a holam on the right is, I believe, entirely
>a misinterpretation of a holam haser placed on a preceding consonant and
>typeset in a way that it appears over the alef. As several examples have
>shown, a holam haser can go quite far over the next letter, even if the next
>letter bears a vowel mark of its own. This is most common with a lamed-holam
>haser sequence, because the lamed ascender "pushes" the holam dot further to
>the left. (Note that other languages have combining characters whose effects
>on rendering can extend far from where the combining character logically
>falls in the text stream. The same effect is at work here.)
>
>

But this is not quite a full explanation. Undeniably in some texts, e.g. BHS, holam haser is regularly shifted on to a following alef when this alef is silent i.e. carries no vowel point, but not shifted when the alef is a regular consonant. The exact rules for when holam is shifted are complex and the details depend on the specific typographer. But these are the rules used in BHS, which happen to correspond closely to the rules for formation of holam male.

>PRINCIPLE 4. Any solution to the holam problem must be consistent with the
>preceding principles. This applies, in particular, to solutions that involve
>non-standard character ordering (putting a combining character logically
>before its associated base character) or the use of layout control
>characters. Solutions need to address the distinction between holam male and
>vav with holam haser; no other uses of holam need be accommodated.
>
>

Not agreed. There are many other uses of holam including before alef, after lamed, before shin, after sin etc etc which, at least in some publications, do not follow the simple positioning rules.
Any solution to the general issue must accommodate all of these special uses as well as the vav-holam sequence and holam male. We can't produce an adequate standard for Hebrew or any other language with a patchwork of inconsistent fixes to specific issues without seeing the bigger picture. >PRINCIPLE 5. Any solution must be assessed in terms of the full spectrum of >Unicode use: data entry, encoding, rendering, searching, data exchange, >editing, collation, etc. > > Agreed. >PRINCIPLE 6. Ideally, solutions should work transparently with all legacy >data. This may not be possible, as legacy data are already encoded with >conflicting conventions. > > Agreed. >=========================== > >These may or may not be a set of principles to which everyone can agree. But >until we develop such a set, we will continue to talk at cross-purposes. > >Ted > > Agreed. And I think you and I are agreed on almost everything, except that I don't think we can fix this problem in isolation from the more general issues. 
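For reference, the candidate encodings of holam male floated earlier in this message can be written out as explicit code point sequences. The following is only a sketch for comparing the proposals side by side: the dictionary labels and the `describe` helper are made up for illustration, and none of these sequences is a recommended or standardized encoding.

```python
import unicodedata

HOLAM = "\u05B9"  # HEBREW POINT HOLAM
VAV = "\u05D5"    # HEBREW LETTER VAV
ZWJ = "\u200D"    # ZERO WIDTH JOINER
ZWNJ = "\u200C"   # ZERO WIDTH NON-JOINER

# Hypothetical labels; the sequences themselves are the ones discussed
# in this thread, not sequences endorsed by the Unicode Standard.
CANDIDATES = {
    "holam vav": HOLAM + VAV,
    "holam ZWJ vav": HOLAM + ZWJ + VAV,
    "ZWNJ holam vav": ZWNJ + HOLAM + VAV,
    "ZWNJ holam ZWJ vav": ZWNJ + HOLAM + ZWJ + VAV,
}


def describe(sequence):
    """Return 'U+XXXX NAME' for each code point in the sequence."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


if __name__ == "__main__":
    for label, sequence in CANDIDATES.items():
        print(label)
        for line in describe(sequence):
            print("   ", line)
```

Listing the sequences this way makes it easier to check each one against the contexts under Principle 2 (start of line, after a space, after maqaf, and so on).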
-- Peter Kirk peter.r.kirk@ntlworld.com http://web.onetel.net.uk/~peterkirk/ From - Fri Aug 01 12:29:23 2003 X-UIDL: <3F2ABDF4.7020705@ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta03-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801192240.XYMX14590.mta03-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 20:22:40 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71JMVd23178; Fri, 1 Aug 2003 15:22:31 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 15:22:31 -0400 (EDT) Received: from smtp-out1.blueyonder.co.uk (smtp-out1.blueyonder.co.uk [195.188.213.4]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71JMUd23172 for ; Fri, 1 Aug 2003 15:22:30 -0400 Received: from ntlworld.com ([80.195.249.86]) by smtp-out1.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.5600); Fri, 1 Aug 2003 20:22:29 +0100 Message-ID: <3F2ABDF4.7020705@ntlworld.com> Date: Fri, 01 Aug 2003 12:22:28 -0700 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: kentk@cs.chalmers.se CC: hebrew@unicode.org Subject: [hebrew] Re: From [b-hebrew] Variant forms of vav with holem References: <000201c3585c$25baa3d0$94c11081@chalmers95a69n> In-Reply-To: <000201c3585c$25baa3d0$94c11081@chalmers95a69n> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 01 Aug 2003 19:22:29.0633 (UTC) FILETIME=[3FC39F10:01C35862] X-archive-position: 24 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peter.r.kirk@ntlworld.com Precedence: bulk X-list: hebrew On 01/08/2003 11:38, Kent Karlsson wrote: >>>Yes. 
Without tailoring, the default Unicode collation order is: >>> >>>X - vav - holam - Y >>>X - holam - vav - Y >>> >>> >... > > >>Would it be possible to tailor the rules so that the two collated as >>identical? >> >> > >Yes, but why would you want to do that? > > /kent k > > > > Because in legacy encoded data and some existing Unicode data holam male is encoded as vav followed by holam rather than vice versa. Collating the sequences as identical would minimise the incompatibilities. Or would it? Maybe you know better than me if this would help. Is this done in other cases of incompatible legacy encodings? -- Peter Kirk peter.r.kirk@ntlworld.com http://web.onetel.net.uk/~peterkirk/ From - Fri Aug 01 12:49:22 2003 X-UIDL: <000201c3586d$93f3e4e0$0401c80a@QSM4> X-Mozilla-Status: 0003 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta02-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801194546.QHGP9049.mta02-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 20:45:46 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71JjYd25358; Fri, 1 Aug 2003 15:45:34 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 15:45:34 -0400 (EDT) Received: from mx-out.daemonmail.net (mx-out.daemonmail.net [216.104.160.39]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71JjXd25351 for ; Fri, 1 Aug 2003 15:45:33 -0400 Received: from mx0.emailqueue.net (localhost.daemonmail.net [127.0.0.1]) by mx-out.daemonmail.net (8.9.3p2/8.9.3) with SMTP id MAA54980 for ; Fri, 1 Aug 2003 12:45:32 -0700 (PDT) (envelope-from rosennej@qsm.co.il) Received: from (212.235.69.45 [212.235.69.45]) by mail.qsm.co.il with ESMTP id KIE0qMK2 Fri, 01 Aug 2003 12:45:29 -0700 (PDT) From: "Jony Rosenne" To: Subject: [hebrew] Holam Date: Fri, 1 Aug 2003 22:43:33 +0200 Message-ID: <000201c3586d$93f3e4e0$0401c80a@QSM4> MIME-Version: 1.0 Content-Type: 
text/plain; charset="us-ascii" X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.4510 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 In-Reply-To: <08be01c3585c$b84c75c0$deeefea9@Xerxes> Importance: Normal Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by unicode.org id h71JjXd25351 X-archive-position: 25 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: rosennej@qsm.co.il Precedence: bulk X-list: hebrew Some comments inline. Additional principles: PRINCIPLE 7. Any solution must not impose on users who do not wish to make the distinction or who may not be aware of it. PRINCIPLE 8. Any solution must allow for the co-existence of both usages, and not impede the full spectrum of Unicode use: data entry, encoding, rendering, searching, data exchange, editing, collation, etc. PRINCIPLE 9. In encoding, editing, searching and collation, distinguishing between the two usages should be optional with the default behavior of not making the distinction. PRINCIPLE 9. An implementation of Hebrew may ignore the distinction, subject to Unicode conformance requirement C10. Jony > -----Original Message----- > From: hebrew-bounce@unicode.org > [mailto:hebrew-bounce@unicode.org] On Behalf Of Ted Hopp > Sent: Friday, August 01, 2003 8:43 PM > To: hebrew@unicode.org > Subject: SPAM: [hebrew] Where holam male can occur > > > I acknowledge that 99+% of the uses of Hebrew characters is > in regular Hebrew (and related languages) text. However, > Unicode cannot afford to ignore the small but important > remaining uses. > > Regarding holam male in particular, I want to put forth some > principles to guide further discussion. > > ===================== > > PRINCIPLE 0. A mechanism is needed to distinguish in > Unicode-encoded text between holam male and vav with holam > haser. This is called the "holam problem". 
> > I hope that this need has been sufficiently established. The > principle is NOT that there have to be distinct encodings for > the two alternatives (although I believe that will be > unavoidable), just that there is a need to distinguish. > > PRINCIPLE 1. Any solution to the holam problem must be > consistent with all regular Hebrew usage. > > PRINCIPLE 2. Holam male can be the first character of a line > or follow any arbitrary character. > > I hope everyone accepts the dictionary example regarding the > start of a line. Here are a few other examples: JR: This applies to any combining mark in Unicode, in the contexts you note. > > A. Holam male can follow a space (e.g., in a comma-separated > list of Hebrew vowels). > > B. A maqaf is sometimes used to separate letters of a word > for purposes of emphasis (e.g., to draw attention to the > spelling). Thus, a holam male can follow a maqaf (WITH a word > and line break opportunity after the maqaf). > > C. I'm looking at a college-level beginning Hebrew text for > English speakers that has fill-in-the-blank spelling > exercises. Several of them have a holam male immediately > following a wide, underlined blank space. If this were done > in Unicode, the blank might be represented by OBJECT > REPLACEMENT CHARACTER, followed by holam male. > > D. In one vowel pronunciation table, the holam male follows > something that looks like 00D7 (MULTIPLICATION SIGN). > > E. Holam male can follow a HEBREW PUNCTUATION GERESH in > spelling of foreign words (e.g., George => gimel-geresh-holam > male (even in unpointed text!)-resh-gimel-geresh). JR: In unpointed Hebrew texts it will be a Vav: Gimel Geresh Vav resh Gimel Geresh > > PRINCIPLE 3. Holam exists in Hebrew only as holam haser or > holam male. These two cases are mutually exclusive and exhaustive. > > In particular, there is no "holam alef." This notion came > into being as part of this discussion and does not exist in > Hebrew grammar. 
>
> I don't have examples of something that doesn't exist. :)
>
> This notion of an alef that has a holam on the right is, I
> believe, entirely a misinterpretation of a holam haser placed
> on a preceding consonant and typeset in a way that it appears
> over the alef. As several examples have shown, a holam haser
> can go quite far over the next letter, even if the next
> letter bears a vowel mark of its own. This is most common
> with a lamed-holam haser sequence, because the lamed ascender
> "pushes" the holam dot further to the left. (Note that other
> languages have combining characters whose effects on
> rendering can extend far from where the combining character
> logically falls in the text stream. The same effect is at work here.)
>
> PRINCIPLE 4. Any solution to the holam problem must be
> consistent with the preceding principles. This applies, in
> particular, to solutions that involve non-standard character
> ordering (putting a combining character logically before its
> associated base character) or the use of layout control
> characters. Solutions need to address the distinction between
> holam male and vav with holam haser; no other uses of holam
> need be accommodated.
>
> PRINCIPLE 5. Any solution must be assessed in terms of the
> full spectrum of Unicode use: data entry, encoding,
> rendering, searching, data exchange, editing, collation, etc.
>
> PRINCIPLE 6. Ideally, solutions should work transparently
> with all legacy data. This may not be possible, as legacy
> data are already encoded with conflicting conventions.

JR: It should work with legacy data correctly encoded according to Unicode
4.0, which does not make the distinction.

>
> ===========================
>
> These may or may not be a set of principles to which everyone
> can agree. But until we develop such a set, we will continue
> to talk at cross-purposes.
>
> Ted
>
> Ted Hopp, Ph.D.
> ZigZag, Inc.
> ted@newSLATE.com > +1-301-990-7453 > > newSLATE is your personal learning workspace > ...on the web at http://www.newSLATE.com/ > > > > > From - Fri Aug 01 12:59:23 2003 X-UIDL: <000301c3586e$d47d8ab0$0401c80a@QSM4> X-Mozilla-Status: 0003 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta02-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801195444.QWUZ9049.mta02-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 20:54:44 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71JsVd27954; Fri, 1 Aug 2003 15:54:31 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 15:54:31 -0400 (EDT) Received: from mx-out.daemonmail.net (mx-out.daemonmail.net [216.104.160.39]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71JsUd27948 for ; Fri, 1 Aug 2003 15:54:30 -0400 Received: from mx0.emailqueue.net (localhost.daemonmail.net [127.0.0.1]) by mx-out.daemonmail.net (8.9.3p2/8.9.3) with SMTP id MAA57674 for ; Fri, 1 Aug 2003 12:54:30 -0700 (PDT) (envelope-from rosennej@qsm.co.il) Received: from (212.235.69.45 [212.235.69.45]) by mail.qsm.co.il with ESMTP id zzE0V4L2 Fri, 01 Aug 2003 12:54:27 -0700 (PDT) From: "Jony Rosenne" To: Subject: [hebrew] Holam Date: Fri, 1 Aug 2003 22:52:31 +0200 Message-ID: <000301c3586e$d47d8ab0$0401c80a@QSM4> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.4510 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165 In-Reply-To: <3F2ABD04.1000303@ntlworld.com> Importance: Normal X-archive-position: 26 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: rosennej@qsm.co.il Precedence: bulk X-list: hebrew > -----Original Message----- > From: hebrew-bounce@unicode.org > 
[mailto:hebrew-bounce@unicode.org] On Behalf Of Peter Kirk > Sent: Friday, August 01, 2003 9:18 PM > To: Ted Hopp > Cc: hebrew@unicode.org > Subject: [hebrew] Re: Where holam male can occur > ... > > Yes, and followed by any arbitrary character. It may even > carry a vowel > point as in the divine name as shown in my document and in one of my > postings. I don't believe it is a Holam Male, at least not according to Jewish tradition. ... > > > But this is not quite a full explanation. Undeniably in some > texts, e.g. > BHS, holam haser is regularly shifted on to a following alef > when this > alef is silent i.e. carries no vowel point, but not shifted when the > alef is a regular consonant. The exact rules for when holam > is shifted > are complex and the details depend on the specific typographer. But > these are the rules used in BHS, which happen to correspond > closely to > the rules for formation of holam male. Maybe the scribe considered it to be a Holam Male? ... > We can't produce an > adequate > standard for Hebrew or any other language with a patchwork of > inconsistent fixes to specific issues without seeing the > bigger picture. Agreed. ... 
> > -- > Peter Kirk > peter.r.kirk@ntlworld.com > http://web.onetel.net.uk/~peterkirk/ > > Jony From - Fri Aug 01 13:19:23 2003 X-UIDL: <3F2ACA00.2090808@ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta07-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801201403.GDBT1286.mta07-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 21:14:03 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71KDtd30043; Fri, 1 Aug 2003 16:13:55 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 16:13:55 -0400 (EDT) Received: from smtp-out1.blueyonder.co.uk (smtp-out1.blueyonder.co.uk [195.188.213.4]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71KDsd30037 for ; Fri, 1 Aug 2003 16:13:54 -0400 Received: from ntlworld.com ([80.195.249.86]) by smtp-out1.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.5600); Fri, 1 Aug 2003 21:13:53 +0100 Message-ID: <3F2ACA00.2090808@ntlworld.com> Date: Fri, 01 Aug 2003 13:13:52 -0700 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: hebrew@unicode.org Subject: [hebrew] Re: Holam References: <000201c3586d$93f3e4e0$0401c80a@QSM4> In-Reply-To: <000201c3586d$93f3e4e0$0401c80a@QSM4> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 01 Aug 2003 20:13:53.0621 (UTC) FILETIME=[6DF6D850:01C35869] X-archive-position: 27 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peter.r.kirk@ntlworld.com Precedence: bulk X-list: hebrew On 01/08/2003 13:43, Jony Rosenne wrote: >Some comments inline. > >Additional principles: > >PRINCIPLE 7. 
Any solution must not impose on users who do not wish to make >the distinction or who may not be aware of it. > >PRINCIPLE 8. Any solution must allow for the co-existence of both usages, >and not impede the full spectrum of Unicode use: data entry, encoding, >rendering, searching, data exchange, editing, collation, etc. > >PRINCIPLE 9. In encoding, editing, searching and collation, distinguishing >between the two usages should be optional with the default behavior of not >making the distinction. > >PRINCIPLE 9. An implementation of Hebrew may ignore the distinction, subject >to Unicode conformance requirement C10. > >Jony > > Agreed with all of these. Your first principle 9 is basically what I just wrote to Kent. > > >>PRINCIPLE 6. Ideally, solutions should work transparently >>with all legacy data. This may not be possible, as legacy >>data are already encoded with conflicting conventions. >> >> > >JR: It should work with legacy data correctly encoded according to Unicode >4.0, which does not make the distinction. > > Agreed. More precisely, Unicode 4.0 does not specify whether the distinction is to be made. It seems to me that in Unicode 4.0 both and are valid ways to encode holam male, because the standard underspecifies. I hope that later versions of the standard will specify adequately which of these two, or something else, is to be used. 
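As a rough illustration of the first principle 9 (not making the distinction by default when searching or comparing), here is a minimal sketch that treats the two orders of vav and holam as the same. It is plain code point folding, not a tailoring of the Unicode Collation Algorithm, and the function names are hypothetical. It only handles a holam that is immediately adjacent to the vav.

```python
import unicodedata

VAV = "\u05D5"    # HEBREW LETTER VAV
HOLAM = "\u05B9"  # HEBREW POINT HOLAM


def fold_holam_order(text):
    """Map <holam, vav> to <vav, holam> so both orders get the same key.

    This is a comparison key only: the fold deliberately throws away the
    distinction between holam male and vav with holam haser.
    """
    text = unicodedata.normalize("NFC", text)
    folded = text.replace(HOLAM + VAV, VAV + HOLAM)
    # Renormalize so any other marks on the vav end up in canonical order.
    return unicodedata.normalize("NFC", folded)


def same_ignoring_holam_order(a, b):
    """True if a and b differ at most in the order of holam and vav."""
    return fold_holam_order(a) == fold_holam_order(b)


if __name__ == "__main__":
    x, y = "\u05E9", "\u05E8"  # shin and resh stand in for X and Y
    vav_first = x + VAV + HOLAM + y
    holam_first = x + HOLAM + VAV + y
    print(vav_first == holam_first)
    print(same_ignoring_holam_order(vav_first, holam_first))
```

An application that does want the distinction would simply compare the unfolded strings, so the fold can stay optional as Jony's principles ask.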
-- Peter Kirk peter.r.kirk@ntlworld.com http://web.onetel.net.uk/~peterkirk/ From - Fri Aug 01 13:29:23 2003 X-UIDL: <3F2ACC73.1020207@ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta07-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801202442.GYAL1286.mta07-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 21:24:42 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71KOMd30269; Fri, 1 Aug 2003 16:24:22 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 16:24:22 -0400 (EDT) Received: from smtp-out7.blueyonder.co.uk (smtp-out7.blueyonder.co.uk [195.188.213.10]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71KOMd30263 for ; Fri, 1 Aug 2003 16:24:22 -0400 Received: from ntlworld.com ([80.195.249.86]) by smtp-out7.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.5600); Fri, 1 Aug 2003 21:24:21 +0100 Message-ID: <3F2ACC73.1020207@ntlworld.com> Date: Fri, 01 Aug 2003 13:24:19 -0700 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: Jony Rosenne CC: hebrew@unicode.org Subject: [hebrew] Re: Holam References: <000301c3586e$d47d8ab0$0401c80a@QSM4> In-Reply-To: <000301c3586e$d47d8ab0$0401c80a@QSM4> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 01 Aug 2003 20:24:21.0051 (UTC) FILETIME=[E3F110B0:01C3586A] X-archive-position: 28 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peter.r.kirk@ntlworld.com Precedence: bulk X-list: hebrew On 01/08/2003 13:52, Jony Rosenne wrote: > >>Yes, and followed by any arbitrary character. 
It may even >>carry a vowel >>point as in the divine name as shown in my document and in one of my >>postings. >> >> > >I don't believe it is a Holam Male, at least not according to Jewish >tradition. > > OK, but the BHS editor seems to have tried to write a holam male here, and in several other places (and where there is no accent which might displace the holam) e.g. Genesis 9:26, Exodus 3:2, 13:3,9,15 etc etc. So it is not just an accident. I think Ted found similar things in Hebrew renderings of English pronunciation in a dictionary. The more general principle lying behind this specific principle of Ted's is that we mustn't restrict Unicode to representing what is valid Hebrew spelling but must allow it to reproduce irregularities as well, and other languages in Hebrew script. -- Peter Kirk peter.r.kirk@ntlworld.com http://web.onetel.net.uk/~peterkirk/ From - Fri Aug 01 13:46:00 2003 X-UIDL: <1030801214049.29682@mss6-svc.service.ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta07-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801204049.ICFR1286.mta07-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 21:40:49 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71KeBd31049; Fri, 1 Aug 2003 16:40:11 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 16:40:11 -0400 (EDT) Received: from netscape.com (c3po.aoltw.net [64.236.137.25]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71KeBd31043 for ; Fri, 1 Aug 2003 16:40:11 -0400 Received: from dredd.mcom.com (dredd.nscp.aoltw.net [10.169.8.48]) by netscape.com (8.10.0/8.10.0) with ESMTP id h71Ke5D20422 for ; Fri, 1 Aug 2003 13:40:05 -0700 (PDT) Received: from netscape.com ([10.169.97.109]) by dredd.mcom.com (Netscape Messaging Server 4.15) with ESMTP id HIYK2T00.IK5; Fri, 1 Aug 2003 13:40:05 -0700 Message-ID: 
<3F2AD02C.2060606@netscape.com> Date: Fri, 01 Aug 2003 13:40:12 -0700 From: smontagu@netscape.com (Simon Montagu) User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax) X-Accept-Language: he, en-us, en MIME-Version: 1.0 To: Peter Kirk CC: hebrew@unicode.org Subject: [hebrew] Re: Holam References: <000301c3586e$d47d8ab0$0401c80a@QSM4> <3F2ACC73.1020207@ntlworld.com> In-Reply-To: <3F2ACC73.1020207@ntlworld.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 29 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: smontagu@netscape.com Precedence: bulk X-list: hebrew Peter Kirk wrote: > On 01/08/2003 13:52, Jony Rosenne wrote: > >> >>> Yes, and followed by any arbitrary character. It may even carry a >>> vowel point as in the divine name as shown in my document and in one >>> of my postings. >>> >> >> >> I don't believe it is a Holam Male, at least not according to Jewish >> tradition. >> >> > OK, but the BHS editor seems to have tried to write a holam male here, > and in several other places (and where there is no accent which might > displace the holam) e.g. Genesis 9:26, Exodus 3:2, 13:3,9,15 etc etc. So > it is not just an accident. I think Ted found similar things in Hebrew > renderings of English pronunciation in a dictionary. The more general > principle lying behind this specific principle of Ted's is that we > mustn't restrict Unicode to representing what is valid Hebrew spelling > but must allow it to reproduce irregularities as well, and other > languages in Hebrew script. > The divine name normally appears in BHS without any holam, right? Is the positioning here in BHS different from in words with holam before consonantal vav? 
From - Fri Aug 01 13:56:36 2003 X-UIDL: <3F2AD3D1.4090801@ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta05-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801205559.HEFD25208.mta05-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 21:55:59 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71Ktld32392; Fri, 1 Aug 2003 16:55:47 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 16:55:47 -0400 (EDT) Received: from smtp-out4.blueyonder.co.uk (smtp-out4.blueyonder.co.uk [195.188.213.7]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71Ktkd32386 for ; Fri, 1 Aug 2003 16:55:46 -0400 Received: from ntlworld.com ([80.195.249.86]) by smtp-out4.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.5600); Fri, 1 Aug 2003 21:55:45 +0100 Message-ID: <3F2AD3D1.4090801@ntlworld.com> Date: Fri, 01 Aug 2003 13:55:45 -0700 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: Simon Montagu CC: hebrew@unicode.org Subject: [hebrew] Re: Holam References: <000301c3586e$d47d8ab0$0401c80a@QSM4> <3F2ACC73.1020207@ntlworld.com> <3F2AD02C.2060606@netscape.com> In-Reply-To: <3F2AD02C.2060606@netscape.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 01 Aug 2003 20:55:45.0378 (UTC) FILETIME=[47168C20:01C3586F] X-archive-position: 30 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peter.r.kirk@ntlworld.com Precedence: bulk X-list: hebrew On 01/08/2003 13:40, Simon Montagu wrote: > Peter Kirk wrote: > >> OK, but the BHS editor seems to have tried to write a holam male >> here, and in several other places (and where 
there is no accent which >> might displace the holam) e.g. Genesis 9:26, Exodus 3:2, 13:3,9,15 >> etc etc. So it is not just an accident. I think Ted found similar >> things in Hebrew renderings of English pronunciation in a dictionary. >> The more general principle lying behind this specific principle of >> Ted's is that we mustn't restrict Unicode to representing what is >> valid Hebrew spelling but must allow it to reproduce irregularities >> as well, and other languages in Hebrew script. >> > > The divine name normally appears in BHS without any holam, right? ... Yes. I have a list of about 76 places in the electronic BHS where there is a holam, though I have only checked the positioning in the printed text of the first few. This is of course a tiny proportion of the thousands of occurrences of the name. > ... Is the positioning here in BHS different from in words with holam > before consonantal vav? > > Yes. In the cases I looked at this holam looks just like holam male, i.e. holam above the top right of vav, as in the illustration from Genesis 3:14 in my document. Normal holam before consonantal vav is on the top left of the preceding consonant, as in the illustration from 2 Kings 21:26 (with following resh instead of vav). 
-- Peter Kirk peter.r.kirk@ntlworld.com http://web.onetel.net.uk/~peterkirk/ From - Fri Aug 01 14:19:21 2003 X-UIDL: <20030801211100.GV15977@skunk.reutershealth.com> X-Mozilla-Status: 0003 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta07-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801211305.KJWL1286.mta07-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 22:13:05 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71LCxd00447; Fri, 1 Aug 2003 17:12:59 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 17:12:59 -0400 (EDT) Received: from mail.reutershealth.com ([65.246.141.36]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71LCxd00441 for ; Fri, 1 Aug 2003 17:12:59 -0400 Received: from skunk.reutershealth.com (mail [65.246.141.36]) by mail.reutershealth.com (Pro-8.9.3/Pro-8.9.3) with SMTP id RAA08298 for ; Fri, 1 Aug 2003 17:08:46 -0400 (EDT) Received: by skunk.reutershealth.com (sSMTP sendmail emulation); Fri, 1 Aug 2003 17:11:01 -0400 Date: Fri, 1 Aug 2003 17:11:01 -0400 From: John Cowan To: hebrew@unicode.org Subject: [hebrew] Comments on Peter Kirk's draft Message-ID: <20030801211100.GV15977@skunk.reutershealth.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-archive-position: 31 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: jcowan@reutershealth.com Precedence: bulk X-list: hebrew Boldfacing third-level headings makes them stand out more than first-level and second-level headings, which is bizarre. The statement in the first paragraph of 3.3 that "apart from the end of a word or before shuruq, exactly one vowel point" etc. should (IIRC) mention holam male next to shuruq. 
In 3.3.2, the idea of using a second set of vowel points for left-displaced ("second") vowels is mentioned only to be dismissed: no reasons are given for objecting to it. The statement in the first paragraph of 3.4.1 that right meteg, if encoded, should have a combining class less than any vowel point is based on a misunderstanding. The whole point of encoding combining marks with different combining classes is to make their ordering irrelevant. It is all one whether a renderer gets a, dot-above, dot-below or a, dot-below, dot-above, though the latter happens to be canonical. Similarly, meteg-vowel and vowel-meteg should render the meteg to the left of the vowel not because of the canonical classes but because that is where it goes. Ditto for right-meteg-vowel and vowel-right-meteg. The combining classes are a convention to define what normalization is, not a hack to assist renderers with a rule like "normalize, then render right to left and all will be well". -- "Do I contradict myself? John Cowan Very well then, I contradict myself. jcowan@reutershealth.com I am large, I contain multitudes. 
http://www.ccil.org/~cowan --Walt Whitman, _Leaves of Grass_ http://www.reutershealth.com From - Fri Aug 01 14:55:01 2003 X-UIDL: <3F2AE059.2060006@ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta01-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801215105.WAFD25721.mta01-svc.ntlworld.com@unicode.org> for ; Fri, 1 Aug 2003 22:51:05 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71Lo0d01263; Fri, 1 Aug 2003 17:50:05 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 17:50:00 -0400 (EDT) Received: from smtp-out8.blueyonder.co.uk (smtp-out8.blueyonder.co.uk [195.188.213.11]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71Lnid01241 for ; Fri, 1 Aug 2003 17:49:55 -0400 Received: from ntlworld.com ([80.195.249.86]) by smtp-out8.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.5600); Fri, 1 Aug 2003 22:49:13 +0100 Message-ID: <3F2AE059.2060006@ntlworld.com> Date: Fri, 01 Aug 2003 14:49:13 -0700 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: John Cowan CC: hebrew@unicode.org Subject: [hebrew] Re: Comments on Peter Kirk's draft References: <20030801211100.GV15977@skunk.reutershealth.com> In-Reply-To: <20030801211100.GV15977@skunk.reutershealth.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 01 Aug 2003 21:49:13.0725 (UTC) FILETIME=[BF6972D0:01C35876] X-archive-position: 32 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peter.r.kirk@ntlworld.com Precedence: bulk X-list: hebrew On 01/08/2003 14:11, John Cowan wrote: >Boldfacing third-level headings makes them stand out more than 
first-level >and second-level headings, which is bizarre. > > Microsoft standard styles, believe it or not. I've just changed them to italic instead of bold, I hope that is better. >The statement in the first paragraph of 3.3 that "apart from the end >of a word or before shuruq, exactly one vowel point" etc. should (IIRC) >mention holam male next to shuruq. > > OK, depending on what you consider holam male to be. I have changed the wording here. >In 3.3.2, the idea of using a second set of vowel points for >left-displaced ("second") vowels is mentioned only to be dismissed: >no reasons are given for objecting to it. > > Good point. Although I don't much like this proposal, I'm not quite sure why. I know Ted objected to it, and it does seem wasteful to define so many new characters for so few occurrences. I have deleted the sentence "All of these proposals have serious drawbacks." and added at the end of the paragraph "The proposal to change the /vowel points/ in all existing texts has the serious drawback that it invalidates all existing pointed Hebrew texts. While there are no serious problems with using a second set of /vowel points/ in only these rather few anomalous cases, it does seem to be an inefficient solution to define so many new characters which will be used so rarely." >The statement in the first paragraph of 3.4.1 that right meteg, if >encoded, should have a combining class less than any vowel point is >based on a misunderstanding. The whole point of encoding combining marks >with different combining classes is to make their ordering irrelevant. > > Understood. I have changed this to "with a canonical combining class different from that of any /vowel point/.", with a footnote "For efficient rendering, this combining class should be less than that of any /vowel point/." For it does make things easier for font implementers if the canonical order is close to the logical order of rendering. 
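The ordering point can be checked directly. What follows is a minimal sketch in Python, assuming only the standard unicodedata module; the helper name canonical_order and the choice of bet with qamats as the sample letter and vowel are illustrative and do not come from the draft. It shows that canonical reordering sorts adjacent marks by combining class, so a meteg stored before a vowel point does not stay there once the text is normalized.

```python
import unicodedata

BET = "\u05D1"      # HEBREW LETTER BET (sample base letter, chosen for illustration)
QAMATS = "\u05B8"   # HEBREW POINT QAMATS
METEG = "\u05BD"    # HEBREW POINT METEG


def canonical_order(text: str) -> str:
    """Return text in NFD, which sorts adjacent combining marks
    by their canonical combining class."""
    return unicodedata.normalize("NFD", text)


# Latin case from the thread: dot above (class 230) and dot below (class 220).
above_first = "a\u0307\u0323"
below_first = "a\u0323\u0307"
print(canonical_order(above_first) == canonical_order(below_first))

# Hebrew case: compare the classes of qamats and meteg, then normalize.
meteg_first = BET + METEG + QAMATS
vowel_first = BET + QAMATS + METEG
print(unicodedata.combining(QAMATS), unicodedata.combining(METEG))
print(canonical_order(meteg_first) == vowel_first)
```

Both input orders normalize to the same string, which is why a renderer cannot rely on stored order alone to tell a right meteg from an ordinary one when the marks have different classes.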
I know that implementation of Hebrew fonts has been seriously complicated by this mismatch. See section F4 of http://scripts.sil.org/cms/sites/nrsi/media/BibHebAltCharsProposal.pdf - I'm not agreeing with this proposal, just pointing to the strong opinions expressed there. >It is all one whether a renderer gets a, dot-above, dot-below or a, >dot-below, dot-above, though the latter happens to be canonical. >Similarly, meteg-vowel and vowel-meteg should render the meteg to the >left of the vowel not because of the canonical classes but because that >is where it goes. Ditto for right-meteg-vowel and vowel-right-meteg. > >The combining classes are a convention to define what normalization is, >not a hack to assist renderers with a rule like "normalize, then render >right to left and all will be well". > > > Thank you for your help. I will wait a bit longer for further comments and then post a second draft. -- Peter Kirk peter.r.kirk@ntlworld.com http://web.onetel.net.uk/~peterkirk/ From - Fri Aug 01 16:06:34 2003 X-UIDL: <005101c35880$e7cd5e80$9ec3fed8@kgfeuerherm> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta05-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801230222.OSDM25208.mta05-svc.ntlworld.com@unicode.org> for ; Sat, 2 Aug 2003 00:02:22 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71N24d02837; Fri, 1 Aug 2003 19:02:04 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 19:02:04 -0400 (EDT) Received: from fep04-mail.bloor.is.net.cable.rogers.com (fep04-mail.bloor.is.net.cable.rogers.com [66.185.86.74]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71N23d02831 for ; Fri, 1 Aug 2003 19:02:03 -0400 Received: from kgfeuerherm ([216.254.195.158]) by fep04-mail.bloor.is.net.cable.rogers.com (InterMail vM.5.01.05.12 201-253-122-126-112-20020820) with ESMTP id 
<20030801230111.EGWE392870.fep04-mail.bloor.is.net.cable.rogers.com@kgfeuerherm> for ; Fri, 1 Aug 2003 19:01:11 -0400 Message-ID: <005101c35880$e7cd5e80$9ec3fed8@kgfeuerherm> From: =?iso-8859-1?Q?Karlj=FCrgen_Feuerherm?= To: Subject: [hebrew] Fw: Holam Date: Fri, 1 Aug 2003 19:01:55 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1158 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106 X-Authentication-Info: Submitted using SMTP AUTH LOGIN at fep04-mail.bloor.is.net.cable.rogers.com from [216.254.195.158] using ID at Fri, 1 Aug 2003 19:01:09 -0400 X-archive-position: 33 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: cuneiform@rogers.com Precedence: bulk X-list: hebrew (Sorry Mark accidentally posted to old list) > > Yes, and followed by any arbitrary character. It may even > > carry a vowel > > point as in the divine name as shown in my document and in one of my > > postings. > > I don't believe it is a Holam Male, at least not according to Jewish > tradition. I agree. It may be necessary to treat this following establishing the other general principles if it is not covered by them. 
K From - Fri Aug 01 16:09:25 2003 X-UIDL: <3F2AF2D8.5020200@netscape.com> X-Mozilla-Status: 0003 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta05-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801230819.OZXD25208.mta05-svc.ntlworld.com@unicode.org> for ; Sat, 2 Aug 2003 00:08:19 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71N88d03010; Fri, 1 Aug 2003 19:08:08 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 19:08:08 -0400 (EDT) Received: from netscape.com (r2d2.aoltw.net [64.236.137.26]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71N87d03004 for ; Fri, 1 Aug 2003 19:08:07 -0400 Received: from dredd.mcom.com (dredd.nscp.aoltw.net [10.169.8.48]) by netscape.com (8.10.0/8.10.0) with ESMTP id h71N7si21823 for ; Fri, 1 Aug 2003 16:07:54 -0700 (PDT) Received: from netscape.com ([10.169.97.109]) by dredd.mcom.com (Netscape Messaging Server 4.15) with ESMTP id HIYQXC00.BMB for ; Fri, 1 Aug 2003 16:08:00 -0700 Message-ID: <3F2AF2D8.5020200@netscape.com> Date: Fri, 01 Aug 2003 16:08:08 -0700 From: smontagu@netscape.com (Simon Montagu) User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax) X-Accept-Language: he, en-us, en MIME-Version: 1.0 To: hebrew@unicode.org Subject: [hebrew] [Peter Kirk's draft] 3.8 Inverted nun Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 34 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: smontagu@netscape.com Precedence: bulk X-list: hebrew I hope that the glyph for inverted nun from BHS (http://www.qaya.org/academic/hebrew/Issues-Hebrew-Unicode_html_m7ff26f06.jpg in section 3.8 of Peter's draft) never acquires any kind of pseudo-canonical status. 
It is an obvious make-shift, looking as if it was hacked down from a final mem. Yannis Haralambous already describes it as "not very satisfactory" in his 1994 paper, which just goes to show that Yannis Haralambous is more polite than I am: I would say it stinks. The correct form is simply a regular nun inverted horizontally. Simon From - Fri Aug 01 16:09:25 2003 X-UIDL: <3F2AF2E2.4090902@netscape.com> X-Mozilla-Status: 0003 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta05-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801230830.PADN25208.mta05-svc.ntlworld.com@unicode.org> for ; Sat, 2 Aug 2003 00:08:30 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71N8Od03027; Fri, 1 Aug 2003 19:08:24 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 19:08:24 -0400 (EDT) Received: from netscape.com (c3po.aoltw.net [64.236.137.25]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71N8Od03021 for ; Fri, 1 Aug 2003 19:08:24 -0400 Received: from dredd.mcom.com (dredd.nscp.aoltw.net [10.169.8.48]) by netscape.com (8.10.0/8.10.0) with ESMTP id h71N8AD04109 for ; Fri, 1 Aug 2003 16:08:10 -0700 (PDT) Received: from netscape.com ([10.169.97.109]) by dredd.mcom.com (Netscape Messaging Server 4.15) with ESMTP id HIYQXM00.JOS for ; Fri, 1 Aug 2003 16:08:10 -0700 Message-ID: <3F2AF2E2.4090902@netscape.com> Date: Fri, 01 Aug 2003 16:08:18 -0700 From: smontagu@netscape.com (Simon Montagu) User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax) X-Accept-Language: he, en-us, en MIME-Version: 1.0 To: hebrew@unicode.org Subject: [hebrew] [Peter Kirk's draft] 3.9 Unusual letter forms Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 35 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: 
hebrew-bounce@unicode.org X-original-sender: smontagu@netscape.com Precedence: bulk X-list: hebrew > In a number of places in the Hebrew Bible unusual letter forms occur. > These include enlarged letters, reduced letters, raised letters and > "broken" letters. These are semantically variant forms of the letters, > and so they do not require specific representation in Unicode. These > variations may be indicated by higher level formatting. I would divide these forms into two classes: enlarged, reduced and raised letters on the one hand and unusual letter forms on the other. The first class can easily be represented by higher-level markup, but the second class, broken vav and joined qof (are there any others?) require alternate glyphs and IMHO deserve their own codepoints just as inverted nun does. Simon From - Fri Aug 01 16:29:25 2003 X-UIDL: <3F2AF56B.7050202@ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta05-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801232027.POKS25208.mta05-svc.ntlworld.com@unicode.org> for ; Sat, 2 Aug 2003 00:20:27 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71NKBd03318; Fri, 1 Aug 2003 19:20:11 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 19:20:11 -0400 (EDT) Received: from smtp-out2.blueyonder.co.uk (smtp-out2.blueyonder.co.uk [195.188.213.5]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71NJed03311 for ; Fri, 1 Aug 2003 19:20:06 -0400 Received: from ntlworld.com ([80.195.249.86]) by smtp-out2.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.5600); Sat, 2 Aug 2003 00:19:06 +0100 Message-ID: <3F2AF56B.7050202@ntlworld.com> Date: Fri, 01 Aug 2003 16:19:07 -0700 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, 
el, fr, de MIME-Version: 1.0 To: Simon Montagu CC: hebrew@unicode.org Subject: [hebrew] Re: [Peter Kirk's draft] 3.8 Inverted nun References: <3F2AF2D8.5020200@netscape.com> In-Reply-To: <3F2AF2D8.5020200@netscape.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 01 Aug 2003 23:19:06.0883 (UTC) FILETIME=[4DFC0D30:01C35883] X-archive-position: 36 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peter.r.kirk@ntlworld.com Precedence: bulk X-list: hebrew On 01/08/2003 16:08, Simon Montagu wrote: > > I hope that the glyph for inverted nun from BHS > (http://www.qaya.org/academic/hebrew/Issues-Hebrew-Unicode_html_m7ff26f06.jpg > in section 3.8 of Peter's draft) never acquires any kind of > pseudo-canonical status. It is an obvious make-shift, looking as if it > was hacked down from a final mem. Yannis Haralambous already describes > it as "not very satisfactory" in his 1994 paper, which just goes to > show that Yannis Haralambous is more polite than I am: I would say it > stinks. > > The correct form is simply a regular nun inverted horizontally. > Thanks for the clarification. With a dot or without? PS Are you the same Simon Montagu who has been responding to my Hebrew related bugs (e.g. http://bugzilla.mozilla.org/show_bug.cgi?id=60546) on Bugzilla? Coincidence, or did you find your way here from the sample data in my bug reports? Mozilla is doing a pretty good job with displaying the Hebrew examples, thanks to you among others I'm sure. 
-- Peter Kirk peter.r.kirk@ntlworld.com http://web.onetel.net.uk/~peterkirk/ From - Fri Aug 01 16:29:25 2003 X-UIDL: <3F2AF604.3030403@ntlworld.com> X-Mozilla-Status: 0003 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta05-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801232204.PQQZ25208.mta05-svc.ntlworld.com@unicode.org> for ; Sat, 2 Aug 2003 00:22:04 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71NLid03352; Fri, 1 Aug 2003 19:21:44 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 19:21:44 -0400 (EDT) Received: from smtp-out8.blueyonder.co.uk (smtp-out8.blueyonder.co.uk [195.188.213.11]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71NLid03346 for ; Fri, 1 Aug 2003 19:21:44 -0400 Received: from ntlworld.com ([80.195.249.86]) by smtp-out8.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.5600); Sat, 2 Aug 2003 00:21:40 +0100 Message-ID: <3F2AF604.3030403@ntlworld.com> Date: Fri, 01 Aug 2003 16:21:40 -0700 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: Simon Montagu CC: hebrew@unicode.org Subject: [hebrew] Re: [Peter Kirk's draft] 3.9 Unusual letter forms References: <3F2AF2E2.4090902@netscape.com> In-Reply-To: <3F2AF2E2.4090902@netscape.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 01 Aug 2003 23:21:40.0885 (UTC) FILETIME=[A9C6E450:01C35883] X-archive-position: 37 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peter.r.kirk@ntlworld.com Precedence: bulk X-list: hebrew On 01/08/2003 16:08, Simon Montagu wrote: > > In a number of places in the Hebrew Bible unusual letter forms occur. 
> > These include enlarged letters, reduced letters, raised letters and > > "broken" letters. These are semantically variant forms of the letters, > > and so they do not require specific representation in Unicode. These > > variations may be indicated by higher level formatting. > > I would divide these forms into two classes: enlarged, reduced and > raised letters on the one hand and unusual letter forms on the other. > > The first class can easily be represented by higher-level markup, but > the second class, broken vav and joined qof (are there any others?) > require alternate glyphs and IMHO deserve their own codepoints just as > inverted nun does. > > Simon > > > Good point. I'll make a change to the document for this. I'll also add a note about the inverted nun glyph. -- Peter Kirk peter.r.kirk@ntlworld.com http://web.onetel.net.uk/~peterkirk/ From - Fri Aug 01 16:39:25 2003 X-UIDL: <3F2AF9D9.3060507@ntlworld.com> X-Mozilla-Status: 0001 X-Mozilla-Status2: 00000000 Return-Path: Received: from unicode.org ([209.235.17.55]) by mta03-svc.ntlworld.com (InterMail vM.4.01.03.37 201-229-121-137-20020806) with ESMTP id <20030801233814.JZLY14590.mta03-svc.ntlworld.com@unicode.org> for ; Sat, 2 Aug 2003 00:38:14 +0100 Received: from sarasvati.unicode.org (localhost.localdomain [127.0.0.1]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71Nc8d03891; Fri, 1 Aug 2003 19:38:08 -0400 Received: with ECARTIS (v1.0.0; list hebrew); Fri, 01 Aug 2003 19:38:08 -0400 (EDT) Received: from smtp-out1.blueyonder.co.uk (smtp-out1.blueyonder.co.uk [195.188.213.4]) by unicode.org (8.11.6/8.11.6) with ESMTP id h71Nc8d03885 for ; Fri, 1 Aug 2003 19:38:08 -0400 Received: from ntlworld.com ([80.195.249.86]) by smtp-out1.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.5600); Sat, 2 Aug 2003 00:38:01 +0100 Message-ID: <3F2AF9D9.3060507@ntlworld.com> Date: Fri, 01 Aug 2003 16:38:01 -0700 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 
X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: Simon Montagu CC: hebrew@unicode.org Subject: [hebrew] Re: [Peter Kirk's draft] 3.9 Unusual letter forms References: <3F2AF2E2.4090902@netscape.com> <3F2AF604.3030403@ntlworld.com> In-Reply-To: <3F2AF604.3030403@ntlworld.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 01 Aug 2003 23:38:01.0527 (UTC) FILETIME=[F248FC70:01C35885] X-archive-position: 38 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peter.r.kirk@ntlworld.com Precedence: bulk X-list: hebrew On 01/08/2003 16:21, Peter Kirk wrote: > On 01/08/2003 16:08, Simon Montagu wrote: > >> > In a number of places in the Hebrew Bible unusual letter forms occur. >> > These include enlarged letters, reduced letters, raised letters and >> > "broken" letters. These are semantically variant forms of the letters, >> > and so they do not require specific representation in Unicode. These >> > variations may be indicated by higher level formatting. >> >> I would divide these forms into two classes: enlarged, reduced and >> raised letters on the one hand and unusual letter forms on the >> other. >> >> The first class can easily be represented by higher-level markup, but >> the second class, broken vav and joined qof (are there any others?) >> require alternate glyphs and IMHO deserve their own codepoints just >> as inverted nun does. >> >> Simon >> >> >> > Good point. I'll make a change to the document for this. I'll also add > a note about the inverted nun glyph. > On further reflection, I'm not sure that I agree. A broken vav is a vav, and a joined qof is a qof. They should be treated identically in searches etc. Perhaps we need something like a variation selector here. Would that be an appropriate mechanism here? 
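To make the search argument concrete, here is a rough Python sketch of how a variation selector could keep a broken vav searchable as an ordinary vav. It rests on assumptions: pairing vav with VARIATION SELECTOR-1 is purely illustrative and is not claimed to be a registered variation sequence, and the function names strip_variation_selectors and matches are made up for this example.

```python
VAV = "\u05D5"   # HEBREW LETTER VAV
VS1 = "\uFE00"   # VARIATION SELECTOR-1 (its use with vav is hypothetical here)


def strip_variation_selectors(text: str) -> str:
    """Drop U+FE00..U+FE0F so a marked variant compares equal to its base letter."""
    return "".join(ch for ch in text if not ("\uFE00" <= ch <= "\uFE0F"))


def matches(haystack: str, needle: str) -> bool:
    """Substring search that ignores variation selectors on both sides."""
    return strip_variation_selectors(needle) in strip_variation_selectors(haystack)


# Shin, lamed, vav, final mem, with and without the hypothetical "broken" marker.
plain_word = "\u05E9\u05DC" + VAV + "\u05DD"
marked_word = "\u05E9\u05DC" + VAV + VS1 + "\u05DD"

print(matches(marked_word, plain_word))
print(marked_word == plain_word)
```

The stored strings differ, so the unusual form is preserved in the text, while a search that ignores the selector treats the two words as the same.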
Bear in mind that as far as I know we are talking about one broken vav and two joined qofs in the