From mark@kli.org Sun Aug 1 09:36:43 2004
Message-ID: <410CFFF0.8080803@kli.org>
Date: Sun, 01 Aug 2004 10:36:32 -0400
From: "Mark E. Shoulson"
To: Peter Kirk
CC: John Cowan, John Hudson, hebrew@unicode.org
Subject: [hebrew] Re: Holam background document
In-Reply-To: <410BE1BF.5010908@qaya.org>

Peter Kirk wrote:

> On 31/07/2004 18:36, John Cowan wrote:
>
>> Peter Kirk scripsit:
>>
>>> 023A LATIN CAPITAL LETTER A WITH STROKE
>>> 023B LATIN CAPITAL LETTER C WITH STROKE
>>> 023C LATIN SMALL LETTER C WITH STROKE
>>> 023D LATIN CAPITAL LETTER L WITH BAR
>>> 023E LATIN CAPITAL LETTER T WITH DIAGONAL STROKE
>>
>> Diacritics that cross, or even merely attach, to their bases normally
>> produce novel letters in Unicode that are not decomposable, with the
>> exceptions of
>> cedilla, ogonek, and Vietnamese horn. This is primarily
>> because the shape and placement of these diacritics depend heavily
>> on the exact nature of the base, and the encoding of diacritics is
>> primarily shape-based, at least within a given script or set of scripts.
>> Some such diacritics are encoded as combining characters anyway, but
>> this is primarily for nonce use.
>
> I know that is the argument. My point is that these characters (and
> the phonetic symbols) are logically, at the abstract character level,
> combinations of a base character and a diacritic, and so should be
> represented as such in Unicode (possibly with precomposed alternatives
> with canonical decompositions). Because of the placement issues you
> mention, the appropriate technology for rendering them is likely to be
> with a single precomposed glyph, which is technically a ligature -
> although the technology should not be specified in TUS (but it is so
> specified if new characters are encoded!). The main argument against
> that seems to be that the font technology to substitute such ligatures
> is not yet always available; but the same technology is required for
> acceptable rendering (except where precomposed forms are used) of base
> letters with separated diacritics e.g. for the correct vertical and
> horizontal positioning of accents.

That's actually not true: anchor-points can position diacritics very precisely indeed, taking into account glyph-shape, etc. Not every renderer supports them, but the technology is there.

Character-crossing marks are another matter, since it's not necessarily enough to position the mark properly, e.g. an oblique stroke might have to have its slant adjusted depending on the letter it's crossing, or a crosswise stroke might need its length adjusted (consider, for example, U+0268 LATIN SMALL LETTER I WITH STROKE vs. U+0289 LATIN SMALL LETTER U BAR, both of which are made from what appears to be the same crossing mark...
but it has to be longer for the u than for the i), etc.

Considering it's Unicode that invented the character/glyph model, or at least determined how it was going to use it, you might want to reconsider accusing it of changing it. Maybe it is you that have a different understanding of what the character/glyph model is, and thus are seeing deviations from it that exist only in your interpretation. Naturally, you're entitled to your own view of the model, but that doesn't mean UTC is being inconsistent.

~mark

From peterkirk@qaya.org Mon Aug 2 05:34:11 2004
Message-ID: <410E18A5.7000400@qaya.org>
Date: Mon, 02 Aug 2004 11:34:13 +0100
From: Peter Kirk
To: "Mark E.
 Shoulson"
CC: hebrew@unicode.org
Subject: [hebrew] Re: Holam background document
In-Reply-To: <410CFFF0.8080803@kli.org>

On 01/08/2004 15:36, Mark E. Shoulson wrote:

> Peter Kirk wrote:
>
>> ... The main argument against that seems to be that the font
>> technology to substitute such ligatures is not yet always available;
>> but the same technology is required for acceptable rendering (except
>> where precomposed forms are used) of base letters with separated
>> diacritics e.g. for the correct vertical and horizontal positioning
>> of accents.
>
> That's actually not true: anchor-points can position diacritics very
> precisely indeed, taking into account glyph-shape, etc. Not every
> renderer supports them, but the technology is there.
>
> Character-crossing marks are another matter, since it's not
> necessarily enough to position the mark properly, e.g.
> an oblique stroke might have to have its slant adjusted depending on
> the letter it's crossing, or a crosswise stroke might need its length
> adjusted (consider, for example, U+0268 LATIN SMALL LETTER I WITH
> STROKE vs. U+0289 LATIN SMALL LETTER U BAR, both of which are made
> from what appears to be the same crossing mark... but it has to be
> longer for the u than for the i), etc.

Well, I think there is some misunderstanding here on both sides. I realise that correct diacritic positioning may be implemented either by anchor points or by (technical) ligatures, whereas some character-crossing marks can be implemented well only by (technical) ligatures, or at least anchor points would need to be supplemented by contextual glyph substitution. (On the other hand, the macron on u may need to be longer than on i as well.)

My point is more as follows: dumb rendering engines support neither anchor points nor ligatures; intelligent modern ones support both; there may be some which support ligatures but not anchor points (at least that is the effect when Uniscribe renders Holam in various contexts with Times New Roman - intelligence in the rendering engine but not in the font); but I have never heard of any which support anchor points without ligatures (but might this be true of an intelligent OpenType font rendered with a rather dumb rendering engine?). But only this possibly empty last class could position diacritics correctly but not render character-crossing marks as ligatures.
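As a side check on the decomposition point quoted earlier in this thread: the stroke letters mentioned (U+0268, U+0289) carry no decomposition mapping in the Unicode Character Database, while letters with cedilla, ogonek or horn do. The following is only a minimal sketch using Python's standard unicodedata module; the helper name and the choice of sample letters are illustrative assumptions, not anything proposed on the list.

```python
import unicodedata


def decomposition_of(ch: str) -> str:
    """Return the UCD decomposition mapping of ch, or '' if it has none."""
    return unicodedata.decomposition(ch)


# Sample letters (an illustrative selection, not an exhaustive survey).
SAMPLES = [
    "\u0268",  # LATIN SMALL LETTER I WITH STROKE
    "\u0289",  # LATIN SMALL LETTER U BAR
    "\u00e7",  # LATIN SMALL LETTER C WITH CEDILLA
    "\u0105",  # LATIN SMALL LETTER A WITH OGONEK
    "\u01a1",  # LATIN SMALL LETTER O WITH HORN
]

if __name__ == "__main__":
    for ch in SAMPLES:
        mapping = decomposition_of(ch) or "(no decomposition)"
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {mapping}")
```

The crossed letters come back with an empty mapping, so a renderer cannot derive them from a base plus a combining stroke by normalization alone; the cedilla, ogonek and horn letters decompose to base plus combining mark.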
--
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/

From verdy_p@wanadoo.fr Mon Aug 2 06:40:48 2004
Message-ID: <018801c47885$89c2e050$6801a8c0@VENGEROV>
From: "Philippe Verdy"
To: "John Hudson", "Peter Kirk"
Subject: [hebrew] Re: Holam background document
Date: Mon, 2 Aug 2004 13:40:39 +0200

From: "John Hudson"
> >> Now do you begin to see the problem?
> >> Unless ZWNJ happens to be painted
> >> by a particular rendering system, there is absolutely no distinction
> >> at the glyph level:
> >>
> >> = /vav/holam/
> >> = /vav/holam/
> >
> > This still presupposes that there is no ligature lookup for the former.
>
> No. This is before we get to the ligature. Ligatures are glyph-level substitutions.
> Ligatures result from glyph strings. Here the glyph string is /vav/holam/. It is
> /vav/holam/ before we decide whether /vav/holam/ might be rendered as /vavhaluma/ or
> /holammale/.

Here I really must object to this over-simplification of what is called here a "glyph string". Actually, in a renderer, a "glyph string" will not contain only an ordered list of glyphs. Instead, each glyph in this list is ATTRIBUTED, notably with contextual rendering flags which control how a glyph string can be parsed or split into smaller "glyph strings" that will be looked up in glyph substitution tables.

OK, in an OpenType font, the GSUB/GPOS tables will only work with strings of glyph IDs, but the model is still flexible enough to allow using special glyph IDs to represent the glyph substitution exclusions. In addition, these tables of substitutions are processed in order: a more precise rule that matches the case of substitution exclusions will be present in such a table before the default substitution rule which excludes this control; in the font itself, the remaining special glyph ID will render as a void glyph, which will be safely ignored.

Font technologies can contain more contextual information for rendering texts processed as "glyph strings". Not all of it needs to be encoded within the font itself; some can live in the renderer instead, and there are many possible "feature" tables which can be added to the font to be used by an extended renderer.
So in fact you would really have such distinctions in the renderer:

= /vav/(forbid ligature before=N, forbid ligature after=N)
  /holam/(forbid ligature before=N, forbid ligature after=N)
= /vav/(forbid ligature before=N, forbid ligature after=Y)
  /holam/(forbid ligature before=Y, forbid ligature after=N)

The renderer could then use some "feature" table in the font that controls how to encode forbidden ligatures. Suppose there's a special glyph ID created for that; then the renderer will transform the attributed glyph string into glyph strings like:

= /vav/ /holam/
= /vav/ /forbid ligature after/ /forbid ligature before/ /holam/

for use in GSUB/GPOS lookup tables. And so the font and the renderer can fully make the necessary distinctions.

All this is possible without changing the Unicode encoding of plain text. So NOTHING prevents a renderer from making a graphical and coherent distinction between the two sequences, and so we already have all we need in Unicode to make the distinction between holam male and vav haluma.

All the "glyph string" processing above is out of scope of Unicode. The way that Unicode abstract characters are converted into "glyph strings", and how glyphs are represented in such strings, is irrelevant for Unicode itself. It's only part of font and renderer technologies.

If you look precisely at the Uniscribe engine API for example, you'll see that text processing uses MANY contextual rendering attributes in addition to glyph IDs... The exact structure is quite complex and can already, today, be controlled in fonts created using special "feature" tables. The generic GSUB/GPOS tables are common in OpenType, but there are many more features and tables in OpenType or other font technologies to represent these contextual attributes.

So please, don't think about "glyph strings" in terms of a simple vector of glyph IDs. A more exact term would be "attributed glyph strings", i.e. a vector of attributed glyphs...
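The attributed-glyph-string idea above can be made concrete with a toy model. This is only a sketch under stated assumptions: it does not use any real OpenType or Uniscribe API, and the marker glyph "nolig", the ligature name "vavholam.lig", the rule list and both function names are hypothetical, chosen purely for illustration.

```python
from dataclasses import dataclass

NO_LIG = "nolig"  # hypothetical special glyph ID standing for "forbid ligature"


@dataclass
class AttributedGlyph:
    name: str
    forbid_ligature_before: bool = False
    forbid_ligature_after: bool = False


def flatten(glyphs):
    """Turn attributed glyphs into a plain glyph-ID string with marker glyphs."""
    out = []
    for g in glyphs:
        if g.forbid_ligature_before:
            out.append(NO_LIG)
        out.append(g.name)
        if g.forbid_ligature_after:
            out.append(NO_LIG)
    return out


# Ordered substitution rules; the first matching rule wins at each position.
RULES = [
    # More precise rule first: markers present, keep the two glyphs separate.
    (["vav", NO_LIG, NO_LIG, "holam"], ["vav", "holam"]),
    # Default rule: form the ligature.
    (["vav", "holam"], ["vavholam.lig"]),
    # Any leftover marker behaves as a void glyph and is dropped.
    ([NO_LIG], []),
]


def substitute(glyph_ids):
    """Apply RULES in a single left-to-right pass over the glyph-ID string."""
    out, i = [], 0
    while i < len(glyph_ids):
        for pattern, replacement in RULES:
            if glyph_ids[i:i + len(pattern)] == pattern:
                out.extend(replacement)
                i += len(pattern)
                break
        else:
            out.append(glyph_ids[i])
            i += 1
    return out
```

With this model, an unattributed vav plus holam ends up as the single ligature glyph, while the pair carrying the "forbid ligature" attributes stays as two glyphs; the ordering of the rules is what makes the distinction.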
From tiro@tiro.com Mon Aug 2 13:04:24 2004
Message-ID: <410E81EF.6090906@tiro.com>
Date: Mon, 02 Aug 2004 11:03:27 -0700
From: John Hudson
To: hebrew@unicode.org
Subject: [hebrew] Re: Holam background document
In-Reply-To: <018801c47885$89c2e050$6801a8c0@VENGEROV>

Philippe Verdy wrote:

> All the "glyph string" processing above is out of scope of Unicode....
I agree, which is why I have a problem with a proposed encoding for the holam male / vav haluma distinction that requires very specific kinds of glyph processing acrobatics that are not the model used by perfectly valid Unicode rendering systems. E.g.

> The renderer could then use some "feature" table in the font that controls
> how to encode forbidden ligatures. Suppose there's a special glyph id
> created for that, then the renderer will transform the attributed glyph
> string into glyph strings like:
> =
> /vav/
> /holam/
> =
> /vav/ /forbid ligature after/
> /forbid ligature before/ /holam/
> for use in GSUB/GPOS lookup tables.

Yes, a renderer could be designed to work in this way, and fonts could be designed to work with such a renderer. The issue is not whether rendering systems can be made that would support the holam male / vav haluma distinction in this way, but whether the distinction can be encoded in such a way that it will work reliably in multiple rendering systems using different glyph processing models. I'm much more concerned about existing rendering systems than I am about imaginary ones.
John Hudson

--
Tiro Typeworks    www.tiro.com
Vancouver, BC     tiro@tiro.com

Currently reading:
The Mass in slow motion, by Ronald Knox
Hebrew manuscripts of the Middle Ages, by Colette Sirat
Breaking the South Slav dream, by Kate Hudson

From peterkirk@qaya.org Mon Aug 2 13:33:07 2004
Message-ID: <410E88E7.2020405@qaya.org>
Date: Mon, 02 Aug 2004 19:33:11 +0100
From: Peter Kirk
To: John Hudson
CC: hebrew@unicode.org
Subject: [hebrew] Re: Holam background document
In-Reply-To: <410E81EF.6090906@tiro.com>

On 02/08/2004 19:03, John Hudson wrote:

> ...
>
> Yes, a renderer could be designed to work in this way, and fonts could
> be designed to work with such a renderer. The issue is not whether
> rendering systems can be made that would support the holam male / vav
> haluma distinction in this way, but whether the distinction can be
> encoded in such a way that it will work reliably in multiple rendering
> systems using different glyph processing models. I'm much more
> concerned about existing rendering systems than I am about imaginary
> ones.

I have demonstrated that my proposal is feasible with one glyph processing model, within the particular rendering system which you had in mind. Is it really necessary for me to demonstrate that other models are feasible? In fact Philippe has demonstrated another model which would work, but maybe only with a different rendering system. And there are potentially many others. So there are very many implementation choices for the proposal. If your choice is further restricted, it is by your rendering system, and not by my proposal or by TUS.
--
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/

From everson@evertype.com Mon Aug 2 16:24:02 2004
In-Reply-To: <410E81EF.6090906@tiro.com>
Date: Mon, 2 Aug 2004 22:23:48 +0100
To: hebrew@unicode.org
From: Michael Everson
Subject: [hebrew] Re: Holam background document

At 11:03 -0700 2004-08-02, John Hudson wrote:

>The issue is not whether rendering systems can be made that would
>support the holam male / vav haluma distinction in this way, but
>whether the distinction can be encoded in such a way that it will
>work reliably in multiple rendering systems using different glyph
>processing models.
The distinction made can be correctly encoded with the proposed HEBREW POINT HOLAM HASER FOR VAV with no particular implementation difficulties.

--
Michael Everson * * Everson Typography * * http://www.evertype.com

From peterkirk@qaya.org Mon Aug 2 16:39:32 2004
Message-ID: <410EB495.3080403@qaya.org>
Date: Mon, 02 Aug 2004 22:39:33 +0100
From: Peter Kirk
To: Michael Everson
CC: hebrew@unicode.org
Subject: [hebrew] Re: Holam background document

On 02/08/2004 22:23, Michael Everson wrote:

> At 11:03 -0700 2004-08-02, John Hudson wrote:
>
>> The issue is not whether rendering systems can be made that would
>> support the holam male / vav haluma distinction in this way, but
>> whether the distinction can be encoded in such a way that it will
>> work reliably in multiple rendering systems using different glyph
>> processing models.
>
> The distinction made can be correctly encoded with the proposed HEBREW
> POINT HOLAM HASER FOR VAV with no particular implementation difficulties.

If every Arabic presentation form and every Indic ligature had been encoded as a separate character, that would have simplified rendering a great deal, and avoided all sorts of nasty problems like the ones in Public Review Issue #37. But Unicode chose instead the character/glyph model. It should stick with it.
--
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/

From tiro@tiro.com Mon Aug 2 17:05:09 2004
Message-ID: <410EBA58.9040908@tiro.com>
Date: Mon, 02 Aug 2004 15:04:08 -0700
From: John Hudson
To: Peter Kirk
CC: Michael Everson, hebrew@unicode.org, Peter Constable, Ken Whistler
Subject: [hebrew] Re: Holam background document
In-Reply-To: <410EB495.3080403@qaya.org>

Peter Kirk wrote:

> If every Arabic
> presentation form and every Indic ligature had been
> encoded as a separate character, that would have simplified rendering a
> great deal, and avoided all sorts of nasty problems like the ones in
> Public Review Issue #37. But Unicode chose instead the character/glyph
> model. It should stick with it.

Holam male is semantically distinct from vav haluma. The position relative to the base and sometimes the size of the dot used in holam male is properly distinct from that used in vav haluma and in every other application of holam to a base letter. Only in 'less exact' typography -- and as a typographer I make no special distinction between 'less exact' and 'poorer quality' -- is this distinction confused and ambiguity permitted. To me it is perfectly obvious that the dot used in holam male should never have been unified with the holam haser: these are separate characters according to pretty much every criterion that Unicode has ever used to determine character identity.

As far as I'm concerned, we're trying to correct a mistake in the Hebrew block. There is only *one* logical way to do this that is perfectly consistent with the character/glyph model and the identity of the dot on holam male. That is to separately encode the dot for holam male character that should have been encoded in the first place.

This solution has been rejected because of existing data that uses the existing holam dot for holam male, the relative frequency of holam male, and the acceptance of the holam male formation as an ambiguous rendering by 'less exact' typographers.

It should be perfectly obvious to anyone that this rejection forces us into a position of compromise, because whatever solution is selected will not be the one logical solution that is consistent with the character/glyph model and the identity of holam male dot as a separate character.

It is precisely because the solution will be a compromise that no one is going to be completely happy with the result, but also why there is no point in standing on principle or launching objections from fundamentals that have already been compromised. We need a solution that works reliably and gets the job done, and that's all we can really hope for. Purity isn't on the table.

John Hudson

--
Tiro Typeworks    www.tiro.com
Vancouver, BC     tiro@tiro.com

Currently reading:
The Mass in slow motion, by Ronald Knox
Hebrew manuscripts of the Middle Ages, by Colette Sirat
Breaking the South Slav dream, by Kate Hudson

From peterkirk@qaya.org Mon Aug 2 17:49:31 2004
Message-ID: <410EC4FA.7090000@qaya.org>
Date: Mon, 02 Aug 2004 23:49:30 +0100
From: Peter Kirk
To: John Hudson
CC: Michael Everson, hebrew@unicode.org, Peter Constable, Ken Whistler
Subject: [hebrew] Re: Holam background document
In-Reply-To: <410EBA58.9040908@tiro.com>

On 02/08/2004 23:04, John Hudson wrote:

> Peter Kirk wrote:
>
>> If every Arabic presentation form and every Indic ligature had been
>> encoded as a separate character, that would have simplified rendering
>> a great deal, and avoided all sorts of nasty problems like the ones
>> in Public Review Issue #37. But Unicode chose instead the
>> character/glyph model. It should stick with it.
>
> Holam male is semantically distinct from vav haluma. The position
> relative to the base and sometimes the size of the dot used in holam
> male is properly distinct from that used in vav haluma and in every
> other application of holam to a base letter. Only in 'less exact'
> typography -- and as a typographer I make no special distinction
> between 'less exact' and 'poorer quality' -- is this distinction
> confused and ambiguity permitted. To me it is perfectly obvious that
> the dot used in holam male should never have been unified with the
> holam haser: these are separate characters according to pretty much
> every criterion that Unicode has ever used to determine character
> identity.

Even if we agree to consider only "exact" typography, I don't think I can quite agree with you here.
I agree that *graphical* tests suggest a separate identity, but this implies a separate glyph but not necessarily a separate character. But non-graphical tests are less conclusive. Is this distinction always made? No. Is this distinction required for proper understanding or reading of the text? Clearly not, because it is often not made. In fact, readers are able to determine whether a VAV with HOLAM is Vav Haluma or Holam Male from whether it is preceded by a vowel or a consonant, and so independently of whether a graphical distinction is made. Very probably, when reading a text in which the distinction is made, readers will disambiguate from this vowel or consonant rule, and pay little attention to the precise position of the dot. All of this suggests that the distinction is not really semantic, but is rather a glyph variation. It is one which is in principle determined by the context, at least in connected fully pointed text, but the glyph selection cannot be made by the rendering engine. This suggests that it is entirely appropriate to make the distinction with ZWNJ, and take the risk that some less fully featured rendering engines will fail to distinguish the renderings properly. And then Jony has already repeatedly objected to "exact" typography being considered normative, and certainly to the deliberate decision not to make the distinction being called "poorer quality". Less exact typography is more common, and is a valid typographical choice. The distinction is an optional one, and neither text providers nor fonts should be obliged to support it. > > As far as I'm concerned, we're trying to correct a mistake in the > Hebrew block. There is only *one* logical way to do this that is > perfectly consistent with the character/glyph model and the identity > of the dot on holam male. That is to separately encode the dot for > holam male character that should have been encoded in the first place. 
I agree that this is the one logical way given your presuppositions about semantic significance. But I can also see that this would be objectionable to those who consider the distinction optional. > > This solution has been rejected because of existing data that uses the > existing holam dot for holam male, the relative frequency of holam > male, and the acceptance of the holam male formation as an ambiguous > rendering by 'less exact' typographers. > > It should be perfectly obvious to anyone that this rejection forces us > into a position of compromise, because whatever solution is selected > will not be the one logical solution that is consistent with the > character/glyph model and the identity of holam male dot as a separate > character. > > It is precisely because the solution will be a compromise that no one > is going to be completely happy with the result, but also why there is > no point on standing on principle or launching objections from > fundamentals that have already been compromised. We need a solution > that works reliably and gets the job done, and that's all we can > really hope for. Purity isn't on the table. I agree. But what is the job which needs to be done? Clearly the majority of Hebrew users at least on this list don't accept that the new character actually does the right job, not least because it appears to make an optional distinction mandatory. Please, everyone, listen to their views. Otherwise you will just end up with a new character which no one will use, because the user community see the solution as worse than the problem to be solved. This would be a waste of everyone's time. 
-- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/ From mark@kli.org Mon Aug 2 18:07:51 2004 Received: with ECARTIS (v1.0.0; list hebrew); Mon, 02 Aug 2004 18:07:51 -0500 (CDT) Received: from pi.meson.org (h-66-134-26-207.nycmny83.covad.net [66.134.26.207]) by unicode.org (8.12.11/8.12.11) with SMTP id i72N7nm0031837 for ; Mon, 2 Aug 2004 18:07:51 -0500 Received: (qmail 10297 invoked from network); 2 Aug 2004 23:07:42 -0000 Received: from nagas.meson.org (HELO kli.org) (1000@192.168.1.101) by pi.meson.org with SMTP; 2 Aug 2004 23:07:42 -0000 Message-ID: <410EC93E.6070706@kli.org> Date: Mon, 02 Aug 2004 19:07:42 -0400 From: "Mark E. Shoulson" User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040113 X-Accept-Language: en, fr MIME-Version: 1.0 To: John Hudson CC: Peter Kirk , Michael Everson , hebrew@unicode.org, Peter Constable , Ken Whistler Subject: [hebrew] Re: Holam background document References: <000301c47273$c2717f00$0100000a@QSM4> <030d01c47277$ab2ca140$3278fe51@VENGEROV> <41055148.6030402@tiro.com> <41056E86.6070102@qaya.org> <4106AB5B.2060604@tiro.com> <4106D619.5050203@qaya.org> <41071CEE.4050005@tiro.com> <20040728034816.GD11028@ccil.org> <4107279B.7040705@tiro.com> <41077B05.4000208@qaya.org> <410843B7.2070900@tiro.com> <4108C0F4.8030301@qaya.org> <410940B0.9080304@tiro.com> <410985FF.7040105@qaya.org> <41099580.9090303@tiro.com> <410A3BE5.7010808@qaya.org> <410AB4E7.5020903@tiro.com> <018801c47885$89c2e050$6801a8c0@VENGEROV> <410E81EF.6090906@tiro.com> <410EB495.3080403@qaya.org> <410EBA58.9040908@tiro.com> In-Reply-To: <410EBA58.9040908@tiro.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1919 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: mark@kli.org Precedence: bulk X-list: hebrew I don't know that this matters, and I think it doesn't, but it might be worth 
noting that I've seen, in documents that do not distinguish holam-male from vav-haluma, the holam-dot above and to the *left* of the vav. That is, it isn't a matter of left vs right vs in-between, it's {left vs non-left} on one hand and {someplace, the same place for both} on the other. ~mark Still looking for LATIN SMALL LETTER AMBIGUOUS LONG OR SHORT S, to handle the three-way distinction among documents that distinguish them and documents that don't... From everson@evertype.com Mon Aug 2 20:40:17 2004 Received: with ECARTIS (v1.0.0; list hebrew); Mon, 02 Aug 2004 20:40:17 -0500 (CDT) Received: from ni-mail3.dna.utvinternet.net (ni-mail3.dna.utvinternet.net [194.46.8.37]) by unicode.org (8.12.11/8.12.11) with ESMTP id i731eFrT002488; Mon, 2 Aug 2004 20:40:16 -0500 Received: from [192.168.0.3] (unverified [195.218.109.86]) by ni-mail3.dna.utvinternet.net (Vircom SMTPRS 3.1.302.0) with ESMTP id ; Tue, 3 Aug 2004 02:40:08 +0100 Mime-Version: 1.0 X-Sender: evr001@mail.dna.ie Message-Id: In-Reply-To: <410EBA58.9040908@tiro.com> References: <000301c47273$c2717f00$0100000a@QSM4> <030d01c47277$ab2ca140$3278fe51@VENGEROV> <41055148.6030402@tiro.com> <41056E86.6070102@qaya.org> <4106AB5B.2060604@tiro.com> <4106D619.5050203@qaya.org> <41071CEE.4050005@tiro.com> <20040728034816.GD11028@ccil.org> <4107279B.7040705@tiro.com> <41077B05.4000208@qaya.org> <410843B7.2070900@tiro.com> <4108C0F4.8030301@qaya.org> <410940B0.9080304@tiro.com> <410985FF.7040105@qaya.org> <41099580.9090303@tiro.com> <410A3BE5.7010808@qaya.org> <410AB4E7.5020903@tiro.com> <018801c47885$89c2e050$6801a8c0@VENGEROV> <410E81EF.6090906@tiro.com> <410EB495.3080403@qaya.org> <410EBA58.9040908@tiro.com> Date: Tue, 3 Aug 2004 02:40:00 +0100 To: hebrew@unicode.org From: Michael Everson Subject: [hebrew] Re: Holam background document Cc: Peter Constable , Ken Whistler Content-Type: text/plain; charset="us-ascii" ; format="flowed" X-archive-position: 1920 X-ecartis-version: Ecartis v1.0.0 Sender: 
hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: everson@evertype.com Precedence: bulk X-list: hebrew At 15:04 -0700 2004-08-02, John Hudson wrote: >As far as I'm concerned, we're trying to correct a mistake in the >Hebrew block. There is only *one* logical way to do this that is >perfectly consistent with the character/glyph model and the identity >of the dot on holam male. That is to separately encode the dot for >holam male character that should have been encoded in the first >place. I disagree, John. Ada Yardeni says quite explicitly: "Holam [is] a dot above a Waw or to the left of the upper corner of any letter." and "The Holam on the Waw should be placed above the centre of its "head", while the Holam on other letters should be placed to the left of their upper corners." Yardeni has described perfectly what I judge to be consensus on this list. She describes the default position of the POINT HOLAM on consonants, and describes its default position on VAV in particular. Statistics have certainly shown that this description is the best, and that holam male is the default use of POINT HOLAM on VAV. >It should be perfectly obvious to anyone that this rejection forces >us into a position of compromise, because whatever solution is >selected will not be the one logical solution that is consistent >with the character/glyph model and the identity of holam male dot as >a separate character. I disagree. I believe that when the tradition moves the dot it is creating a new character, whether or not the dot has a similar function. Of course, a dot is a dot. But n-with-dot-above is used to represent the velar nasal consonant in Latin transliterations of Brahmic scripts, and n-with-dot-below is used to represent the retroflex nasal consonant. Moving the dot makes a difference, and it makes sense to encode the two characters separately. Similarly we have the case in Hebrew. The tradition agrees that the position of the dot has significance. 
The tradition -- and I consider Yardeni to be authoritative in speaking for the tradition -- suggests quite clearly what the default behaviour of HOLAM is. The tradition places a dot in a different place when it has a different meaning. It appears to be a unique solution, intended for use only with a particular letter, which is why Mark and I called it POINT HOLAM HASER FOR VAV. >It is precisely because the solution will be a compromise that no >one is going to be completely happy with the result, but also why >there is no point on standing on principle or launching objections >from fundamentals that have already been compromised. We need a >solution that works reliably and gets the job done, and that's all >we can really hope for. Purity isn't on the table. The joiner "solution" is an ill-conceived hack which tries to press the ZWNJ, whose behaviour is appropriate for cursive scripts (like Arabic or Brahmic scripts), into service for the non-cursive Hebrew script, on foot of a presumption that Hebrew points form ligatures with Hebrew base-letters. If "purity" is to be considered, the intrinsic cursivity or non-cursivity of a script, and indeed the concept of "ligature", should be taken into account. The HOLAM HASER FOR VAV proposal is simple. It recognizes that HEBREW POINT HOLAM is the default character used for a dot above representing [o] in the Hebrew script. It recognizes that there are many implementations which do not distinguish VAV + HOLAM graphically when it could mean either [o] or [vo]. It recognizes that, in implementations which *do* make such a distinction, that the dot for VAV + HOLAM with the meaning [o] is centred over the VAV, rather than positioned to the left as it is for all other characters, and it recognizes that the dot for VAV + HOLAM with the meaning [vo] is positioned to the left. 
It recognizes that the left positioning of the HOLAM over VAV is a *marked* positioning, and it is for this reason that a new character was proposed for this particular usage. This is completely analogous to the model accepted by the UTC and WG2 for the character QAMATS QATAN. The proposal for that character recognized that there are many implementations which do not distinguish QAMATS when it could mean either [a] or [o]. It recognized that in implementations which *do* make such a distinction, the shape for QAMATS QATAN with the meaning [o] is different from the normal shape for QAMATS, and it recognized that the shape for QAMATS QATAN with the meaning [o] is larger, or otherwise differently shaped. It recognized that the special shape of the QAMATS QATAN is a *marked* shape, and it is for this reason that a new character was proposed for this particular usage. The fact that the distinction between QAMATS and QAMATS QATAN was made in the twentieth century, and that the distinction between HOLAM and HOLAM HASER FOR VAV was made in the eleventh century, is irrelevant to the logic which suggests that both characters indicate a marked form different from a default unmarked form. Both solutions use the same logic for differentiating between a marked and an unmarked representation for Hebrew text.
-- Michael Everson * * Everson Typography * * http://www.evertype.com From tiro@tiro.com Mon Aug 2 21:29:03 2004 Received: with ECARTIS (v1.0.0; list hebrew); Mon, 02 Aug 2004 21:29:03 -0500 (CDT) Received: from priv-edtnes28.telusplanet.net (outbound04.telus.net [199.185.220.223]) by unicode.org (8.12.11/8.12.11) with ESMTP id i732Sx8u023433 for ; Mon, 2 Aug 2004 21:29:03 -0500 Received: from tiro.com ([154.5.29.179]) by priv-edtnes28.telusplanet.net (InterMail vM.6.01.03.02 201-2131-111-104-20040324) with ESMTP id <20040803022852.YESJ6727.priv-edtnes28.telusplanet.net@tiro.com>; Mon, 2 Aug 2004 20:28:52 -0600 Message-ID: <410EF837.1070606@tiro.com> Date: Mon, 02 Aug 2004 19:28:07 -0700 From: John Hudson User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6b) Gecko/20031205 Thunderbird/0.4 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Michael Everson CC: hebrew@unicode.org, Peter Constable , Ken Whistler Subject: [hebrew] Re: Holam background document References: <000301c47273$c2717f00$0100000a@QSM4> <030d01c47277$ab2ca140$3278fe51@VENGEROV> <41055148.6030402@tiro.com> <41056E86.6070102@qaya.org> <4106AB5B.2060604@tiro.com> <4106D619.5050203@qaya.org> <41071CEE.4050005@tiro.com> <20040728034816.GD11028@ccil.org> <4107279B.7040705@tiro.com> <41077B05.4000208@qaya.org> <410843B7.2070900@tiro.com> <4108C0F4.8030301@qaya.org> <410940B0.9080304@tiro.com> <410985FF.7040105@qaya.org> <41099580.9090303@tiro.com> <410A3BE5.7010808@qaya.org> <410AB4E7.5020903@tiro.com> <018801c47885$89c2e050$6801a8c0@VENGEROV> <410E81EF.6090906@tiro.com> <410EB495.3080403@qaya.org> <410EBA58.9040908@tiro.com> In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1921 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: tiro@tiro.com Precedence: bulk X-list: hebrew Michael Everson wrote: > Yardeni has described perfectly what I judge to 
be consensus on this > list. She describes the default position of the POINT HOLAM on > consonants, and describes its default position on VAV in particular. > Statistics have certainly shown that this description is the best, and > that holam male is the default use of POINT HOLAM on VAV. This is fairly convincing, and doesn't bother me in the least. If you can argue that the holam haser for vav not only works but is, in fact, the logical separate character, more power to you. John Hudson -- Tiro Typeworks www.tiro.com Vancouver, BC tiro@tiro.com Currently reading: The Mass in slow motion, by Ronald Knox Hebrew manuscripts of the Middle Ages, by Colette Sirat Breaking the South Slav dream, by Kate Hudson From rosennej@qsm.co.il Mon Aug 2 22:44:23 2004 Received: with ECARTIS (v1.0.0; list hebrew); Mon, 02 Aug 2004 22:44:44 -0500 (CDT) Received: from mx-out.daemonmail.net (mx-out.daemonmail.net [216.104.160.39]) by unicode.org (8.12.11/8.12.11) with ESMTP id i733iNZV009154; Mon, 2 Aug 2004 22:44:23 -0500 Received: from localhost.daemonmail.net (localhost.daemonmail.net [127.0.0.1]) by mx-out.daemonmail.net (8.12.9p2/8.12.9) with SMTP id i733iKaw096046; Mon, 2 Aug 2004 20:44:20 -0700 (PDT) (envelope-from rosennej@qsm.co.il) Received: from [217.132.195.51] (via account qsm.co.il) by mx-out.daemonmail.net with ESMTP id IyO0o2V2 authenticated by POP; Mon, 02 Aug 2004 20:44:19 -0700 (PDT) From: "Jony Rosenne" To: "Unicore" Cc: Subject: [hebrew] UTC - Holam proposals Date: Tue, 3 Aug 2004 06:44:24 +0300 Message-ID: <000a01c4790c$32a689c0$0401c80a@QSM4> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.6626 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1441 In-Reply-To: Importance: Normal Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by unicode.org id i733iNZV009154 X-archive-position: 1922 X-ecartis-version: Ecartis v1.0.0 
Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: rosennej@qsm.co.il Precedence: bulk X-list: hebrew > -----Original Message----- > From: hebrew-bounce@unicode.org > [mailto:hebrew-bounce@unicode.org] On Behalf Of Michael Everson > Sent: Tuesday, August 03, 2004 4:40 AM > To: hebrew@unicode.org > Cc: Peter Constable; Ken Whistler > Subject: [hebrew] Re: Holam background document > > > At 15:04 -0700 2004-08-02, John Hudson wrote: > > >As far as I'm concerned, we're trying to correct a mistake in the > >Hebrew block. There is only *one* logical way to do this that is > >perfectly consistent with the character/glyph model and the identity > >of the dot on holam male. That is to separately encode the dot for > >holam male character that should have been encoded in the first > >place. > > I disagree, John. Ada Yardeni says quite explicitly: > > "Holam [is] a dot above a Waw or to the left of the upper corner of > any letter." > > and > > "The Holam on the Waw should be placed above the centre of its > "head", while the Holam on other letters should be placed to the left > of their upper corners." > > Yardeni has described perfectly what I judge to be consensus on this > list. She describes the default position of the POINT HOLAM on > consonants, and describes its default position on VAV in particular. > Statistics have certainly shown that this description is the best, > and that holam male is the default use of POINT HOLAM on VAV. It is quite clear that she ignores Vav Haluma. The Holam of Vav Haluma is just as it is for other letters, above left, and is not distinct in any way. > > >It should be perfectly obvious to anyone that this rejection forces > >us into a position of compromise, because whatever solution is > >selected will not be the one logical solution that is consistent > >with the character/glyph model and the identity of holam male dot as > >a separate character. > > I disagree. 
I believe that when the tradition moves the dot it is > creating a new character, whether or not the dot has a similar > function. Of course, a dot is a dot. But n-with-dot-above is used to > represent the velar nasal consonant in Latin transliterations of > Brahmic scripts, and n-with-dot-below is used to represent the > retroflex nasal consonant. Moving the dot makes a difference, and it > makes sense to encode the two characters separately. > > Similarly we have the case in Hebrew. The tradition agrees that the > position of the dot has significance. The tradition -- and I consider > Yardeni to be authoritative in speaking for the tradition -- suggests > quite clearly what the default behaviour of HOLAM is. This is based on a superficial and incorrect understanding of the text and of the Hebrew script in general. > > The tradition places a dot in a different place when it has a > different meaning. It appears to be a unique solution, intended for > use only with a particular letter, which is why Mark and I called it > POINT HOLAM HASER FOR VAV. > > >It is precisely because the solution will be a compromise that no > >one is going to be completely happy with the result, but also why > >there is no point on standing on principle or launching objections > >from fundamentals that have already been compromised. We need a > >solution that works reliably and gets the job done, and that's all > >we can really hope for. Purity isn't on the table. > > The joiner "solution" is an ill-conceived hack which tries to press > the ZWNJ, whose behaviour is appropriate for cursive scripts (like > Arabic or Brahmic scripts), into service for the non-cursive Hebrew > script, on foot of a presumption that Hebrew points form ligatures > with Hebrew base-letters. If "purity" is to be considered, the > intrinsic cursivity or non-cursivity of a script, and indeed the > concept of "ligature", should be taken into account. 
The Latin-Greek-Cyrillic scripts have a very different attitude to combining marks when compared to Hebrew and Arabic. The joiner solution is probably a hack, but I don't see why it is ill-conceived. If the behavior is accepted by Unicode for one script, it may be used for another. I don't think it is valid to categorize scripts as cursive and non-cursive, and if one were to categorize scripts with respect to the way they handle combining marks I would think that Hebrew (and Arabic, for different reasons) would have their own categories. > > The HOLAM HASER FOR VAV proposal is simple. It recognizes that HEBREW > POINT HOLAM is the default character used for a dot above > representing [o] in the Hebrew script. It recognizes that there are > many implementations which do not distinguish VAV + HOLAM graphically > when it could mean either [o] or [vo]. It recognizes that, in > implementations which *do* make such a distinction, that the dot for > VAV + HOLAM with the meaning [o] is centred over the VAV, rather than > positioned to the left as it is for all other characters, and it > recognizes that the dot for VAV + HOLAM with the meaning [vo] is > positioned to the left. It recognizes that the left positioning of > the HOLAM over VAV is a *marked* positioning, and it is for this > reason that a new character was proposed for this particular usage. There is no reason to distinguish between a Holam Haser point on a Vav and a Holam Haser point on any other letter. It has the same appearance and the same semantics. There is no basis for such a distinction. Unicode does not encode glyphs, it encodes characters. A Holam Haser point is a Holam Haser point, always. The problem we have is with the Holam Male, and it should be solved in that context. The first proposal attempts to do this. The UTC should decide whether this is an acceptable use of ZWNJ (or ZWJ) or not, and if it is there is no need for a disruptive new character.
> > This is completely analogous to the model accepted by the UTC and WG2 > for the character QAMATS QATAN. The proposal for that character > recognized that there are many implementations which do not > distinguish QAMATS when it could mean either [a] or [o]. It > recognized that in implementations which *do* make such a > distinction, that the shape for QAMATS QATAN with the meaning [o] is > different from the normal shape for QAMATS, and it recognized that > the shape for QAMATS QATAN with the meaning [o] is larger, or > otherwise differently shaped. It recognized that the special shape of > the QAMATS QATAN is a *marked* shape, and it is for this reason that > a new character was proposed for this particular usage. The Qamats proposal and the UTC did not address interchange between users who make the distinction and those who do not. Qamats Qatan should have at least a compatibility decomposition to Qamats. > > The fact that the distinction between QAMATS and QAMATS QATAN was > made in the twentieth century, and that the distinction between HOLAM > and HOLAM HASER FOR VAV was made in the eleventh century, is > irrelevant to the logic which suggests that both characters indicate > a marked form different from a default unmarked form. Both solutions > use the same logic for differentiating between an marked and an > unmarked representation for Hebrew text. But there is no distinction between Holam Haser for Vav and, for example, Holam Haser for Yod. Yet the proposal would make them different. This proposal should not be accepted. The UTC should either accept the alternative proposal, using ZWNJ, as a reasonable compromise which has been extensively and intensively discussed and accepted by most Hebrew users, or accept neither and request a better compromise. Since the issue is 1100 years old, a few more months will not be so significant, relatively speaking. 
Jony > -- > Michael Everson * * Everson Typography * * http://www.evertype.com > > > From tiro@tiro.com Mon Aug 2 23:25:25 2004 Received: with ECARTIS (v1.0.0; list hebrew); Mon, 02 Aug 2004 23:25:48 -0500 (CDT) Received: from priv-edtnes28.telusplanet.net (outbound04.telus.net [199.185.220.223]) by unicode.org (8.12.11/8.12.11) with ESMTP id i734PLxE017878; Mon, 2 Aug 2004 23:25:24 -0500 Received: from tiro.com ([154.5.29.179]) by priv-edtnes28.telusplanet.net (InterMail vM.6.01.03.02 201-2131-111-104-20040324) with ESMTP id <20040803042513.BYHQ6727.priv-edtnes28.telusplanet.net@tiro.com>; Mon, 2 Aug 2004 22:25:13 -0600 Message-ID: <410F137C.9020102@tiro.com> Date: Mon, 02 Aug 2004 21:24:28 -0700 From: John Hudson User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6b) Gecko/20031205 Thunderbird/0.4 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jony Rosenne CC: Unicore , hebrew@unicode.org Subject: [hebrew] Re: UTC - Holam proposals References: <000a01c4790c$32a689c0$0401c80a@QSM4> In-Reply-To: <000a01c4790c$32a689c0$0401c80a@QSM4> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1923 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: tiro@tiro.com Precedence: bulk X-list: hebrew Jony Rosenne wrote: > There is no reason to distinguish between a Holam Haser point on a Vav and a > Holam Haser point on any other letter. It has the same appearance and the > same semantics. There is no basis for such a distinction. Unicode does not > encode glyphs, it encodes characters. A Holam Haser point is a Holam Haser > point, always. > > The problem we have is with the Holam Male, and should be solved in that > context. The first proposal attempts to do this. The UTC should decide > whether this is an acceptable use of ZWNJ (or ZWJ) or not, and if it is > there is no need for a disruptive new character. 
Actually, Peter's proposal to use ZWNJ does not attempt to solve the problem in the context of holam male, it attempts to solve the problem in the context of vav haluma, i.e. it treats vav haluma as the variant form that needs special handling. There are practical reasons for doing this, and Michael also believes that there are logical reasons. I don't really care about the logical reasons, since most people seem to be swayed by the practical reasons, which was my point about compromise. Michael has argued that making the distinction in vav haluma instead of holam male is not a compromise, and I don't see any point in arguing this. Whether it is a compromise or not, it is what both proposals do. John Hudson -- Tiro Typeworks www.tiro.com Vancouver, BC tiro@tiro.com Currently reading: The Mass in slow motion, by Ronald Knox Hebrew manuscripts of the Middle Ages, by Colette Sirat Breaking the South Slav dream, by Kate Hudson From ted.hopp@newslate.com Tue Aug 3 00:40:16 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 00:40:16 -0500 (CDT) Received: from smtp03.mrf.mail.rcn.net (smtp03.mrf.mail.rcn.net [207.172.4.62]) by unicode.org (8.12.11/8.12.11) with ESMTP id i735eFhV006664 for ; Tue, 3 Aug 2004 00:40:16 -0500 Received: from 216-164-48-205.c3-0.gth-ubr1.lnh-gth.md.cable.rcn.com ([216.164.48.205] helo=Xerxes) by smtp03.mrf.mail.rcn.net with smtp (Exim 3.35 #7) id 1Brs23-0006cG-00 for hebrew@unicode.org; Tue, 03 Aug 2004 01:40:15 -0400 Message-ID: <018e01c4791c$5747e070$deeefea9@Xerxes> From: "Ted Hopp" To: References: <000301c47273$c2717f00$0100000a@QSM4> <030d01c47277$ab2ca140$3278fe51@VENGEROV> <41055148.6030402@tiro.com> <41056E86.6070102@qaya.org> <4106AB5B.2060604@tiro.com> <4106D619.5050203@qaya.org> <41071CEE.4050005@tiro.com> <20040728034816.GD11028@ccil.org> <4107279B.7040705@tiro.com> <41077B05.4000208@qaya.org> <410843B7.2070900@tiro.com> <4108C0F4.8030301@qaya.org> <410940B0.9080304@tiro.com> <410985FF.7040105@qaya.org> 
<41099580.9090303@tiro.com> <410A3BE5.7010808@qaya.org> <410AB4E7.5020903@tiro.com> <018801c47885$89c2e050$6801a8c0@VENGEROV> <410E81EF.6090906@tiro.com> <410EB495.3080403@qaya.org> <410EBA58.9040908@tiro.com> Subject: [hebrew] Re: Holam background document Date: Tue, 3 Aug 2004 01:40:09 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1437 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1441 X-archive-position: 1924 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: ted.hopp@newslate.com Precedence: bulk X-list: hebrew On Monday, August 02, 2004 9:40 PM, Michael Everson wrote: > At 15:04 -0700 2004-08-02, John Hudson wrote: > >As far as I'm concerned, we're trying to correct a mistake in the > >Hebrew block. There is only *one* logical way to do this that is > >perfectly consistent with the character/glyph model and the identity > >of the dot on holam male. That is to separately encode the dot for > >holam male character that should have been encoded in the first > >place. > > I disagree, John. Ada Yardeni says quite explicitly: > > "Holam [is] a dot above a Waw or to the left of the upper corner of > any letter." > > and > > "The Holam on the Waw should be placed above the centre of its > "head", while the Holam on other letters should be placed to the left > of their upper corners." > > Yardeni has described perfectly what I judge to be consensus on this > list. She describes the default position of the POINT HOLAM on > consonants, and describes its default position on VAV in particular. > Statistics have certainly shown that this description is the best, > and that holam male is the default use of POINT HOLAM on VAV. There is a strong possibility that you are misinterpreting Yardeni here. 
Your argument, I believe, is that she is going from the general rule to the specific, and the specific rule overrides the general within its domain of applicability. This is a fine talmudic principle, but not the only one available! There is nothing in what you quote that contradicts a different view: that she is describing an abstraction (holam) of two distinct object classes (holam male and holam haser); in other words, that she is describing two types of holam. One type is "a dot above a Waw" [i.e., holam male] and the other is a dot "to the left of the upper corner of any letter." [i.e., a holam haser]. Note that she didn't say "any *other* letter" as she seems to be saying in the second quote, so one could use this citation in support of the view that there is no "default position"; that a distinction should *always* be made. Holam haser (on a vav or any other letter) and holam male are the two types of holam. That's how I read the first quote. Regarding the second quote, it must be parsed in the context established by the first: that there are two distinct types of holam. Thus (it can be argued), it should be read as, "The [dot for holam male] should be placed above the centre of [the head of the vav-for-holam-male], while the [dot for holam haser on any letter] should be placed to the left of their upper corners." I don't read anything about "default position" in what you cite. To the contrary, I read this as saying that the default behavior is for there to be two positions. (Alternatively, one could read it as saying that a holam on a vav is *always* centered--regardless of the type of holam. Certainly that is not the consensus view on this list, nor, in my opinion, a correct reading of Yardeni.) >... The tradition -- and I consider > Yardeni to be authoritative in speaking for the tradition -- suggests > quite clearly what the default behaviour of HOLAM is.
First, I would not be so quick to use the definite article when talking of "tradition"; we clearly have examples of more than one tradition. I don't know exactly where your quotes came from (The Book of Hebrew Script?), but could it not be that she is describing her views (informed by tradition) of proper calligraphy, not making pronouncements on "the tradition" itself? Second, if one wants to cite authorities, it would do to also cite from Gershon Silberberg's book "Principles of Printing" (Irgun Mifale haDefus be-Yisrael, Tel Aviv, 1968), which, interestingly, includes a discussion of the use of two types of vav with holam. The book has contributions by Moshe Spitzer (author of the entry "Typography" in the Encyclopedia Judaica), Meir Ben-Yehuda, Shmuel Perez, and Arie Lotan. Maybe I could find a nice quote to back up my idea for use of a variation selector on the vav to solve our little problem. :) Ted Ted Hopp, Ph.D. ZigZag, Inc. ted.hopp@newSLATE.com +1-301-990-7453 newSLATE is your personal learning workspace ...on the web at http://www.newSLATE.com/ From smontagu@smontagu.org Tue Aug 3 01:24:27 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 01:24:27 -0500 (CDT) Received: from sack.dreamhost.com (postfix@sack.dreamhost.com [66.33.213.6]) by unicode.org (8.12.11/8.12.11) with ESMTP id i736OQLu000550 for ; Tue, 3 Aug 2004 01:24:27 -0500 Received: from [127.0.0.1] (unknown [192.117.119.192]) by sack.dreamhost.com (Postfix) with ESMTP id CDA0313D843; Mon, 2 Aug 2004 23:24:23 -0700 (PDT) Message-ID: <410F2F67.1040001@smontagu.org> Date: Tue, 03 Aug 2004 09:23:35 +0300 From: Simon Montagu User-Agent: Mozilla Thunderbird 0.7 (Windows/20040616) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Michael Everson Cc: hebrew@unicode.org, Peter Constable , Ken Whistler Subject: [hebrew] Re: Holam background document References: <000301c47273$c2717f00$0100000a@QSM4> <030d01c47277$ab2ca140$3278fe51@VENGEROV> <41055148.6030402@tiro.com> 
<41056E86.6070102@qaya.org> <4106AB5B.2060604@tiro.com> <4106D619.5050203@qaya.org> <41071CEE.4050005@tiro.com> <20040728034816.GD11028@ccil.org> <4107279B.7040705@tiro.com> <41077B05.4000208@qaya.org> <410843B7.2070900@tiro.com> <4108C0F4.8030301@qaya.org> <410940B0.9080304@tiro.com> <410985FF.7040105@qaya.org> <41099580.9090303@tiro.com> <410A3BE5.7010808@qaya.org> <410AB4E7.5020903@tiro.com> <018801c47885$89c2e050$6801a8c0@VENGEROV> <410E81EF.6090906@tiro.com> <410EB495.3080403@qaya.org> <410EBA58.9040908@tiro.com> In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 1925 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: smontagu@smontagu.org Precedence: bulk X-list: hebrew Michael Everson wrote: > Ada Yardeni says quite explicitly: > > "Holam [is] a dot above a Waw or to the left of the upper corner of any > letter." > > and > > "The Holam on the Waw should be placed above the centre of its "head", > while the Holam on other letters should be placed to the left of their > upper corners." > > Yardeni has described perfectly what I judge to be consensus on this > list. She describes the default position of the POINT HOLAM on > consonants, and describes its default position on VAV in particular. > Statistics have certainly shown that this description is the best, and > that holam male is the default use of POINT HOLAM on VAV. You are quoting from Part 4, Chapter 3 of "The Book of Hebrew Script", "The Designing of Inscriptions and Typefaces". In this chapter Yardeni doesn't claim to be describing any kind of normative model for Hebrew script: on page 308 she characterizes the contents of the chapter as "A few practical suggestions, based on my experience". 
From rosennej@qsm.co.il Tue Aug 3 01:40:46 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 01:41:50 -0500 (CDT) Received: from mx-out.daemonmail.net (mx-out.daemonmail.net [216.104.160.39]) by unicode.org (8.12.11/8.12.11) with ESMTP id i736egTW002895; Tue, 3 Aug 2004 01:40:46 -0500 Received: from localhost.daemonmail.net (localhost.daemonmail.net [127.0.0.1]) by mx-out.daemonmail.net (8.12.9p2/8.12.9) with SMTP id i736efaw034938; Mon, 2 Aug 2004 23:40:41 -0700 (PDT) (envelope-from rosennej@qsm.co.il) Received: from [217.132.29.1] (via account qsm.co.il) by mx-out.daemonmail.net with ESMTP id P590v9F2 authenticated by POP; Mon, 02 Aug 2004 23:40:40 -0700 (PDT) From: "Jony Rosenne" To: "Hebrew List" Cc: "'Unicode List'" Subject: [hebrew] Re: Holam (was Errors in TUS Figure 15.2?) Date: Tue, 3 Aug 2004 09:40:54 +0300 Message-ID: <000e01c47924$d4bff530$0401c80a@QSM4> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.6626 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1441 In-Reply-To: <003c01c4791e$34277f40$030aa8c0@DEWELL> Importance: Normal X-archive-position: 1926 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: rosennej@qsm.co.il Precedence: bulk X-list: hebrew The same applies to recent arguments raised concerning the Holam and Vav and the philosophical nature of the ways they combine. Jony > -----Original Message----- > From: unicode-bounce@unicode.org > [mailto:unicode-bounce@unicode.org] On Behalf Of Doug Ewell > Sent: Tuesday, August 03, 2004 8:53 AM > To: Peter Kirk; Antoine Leca > Cc: Unicode List > Subject: Re: Errors in TUS Figure 15.2? > > > Peter Kirk wrote: > > > The situation is even more confused in that some Unicode > characters, > > e.g. 
U+0152 LATIN CAPITAL LIGATURE OE, are called LIGATUREs > in their > > character names but are unambiguously single Unicode > characters (e.g. > > they have no decomposition even for compatibility). (These are in > > addition to the characters named LIGATURE in the Alphabetic > > Presentation Forms block, which mostly have compatibility > > decompositions.) > > The last thing you want to worry about is the correlation > between whether a character has the word LIGATURE in its name > and whether it is actually a ligature. That way lies madness. > > -Doug Ewell > Fullerton, California > http://users.adelphia.net/~dewell/ > > > > > From peterkirk@qaya.org Tue Aug 3 05:06:58 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 05:06:58 -0500 (CDT) Received: from pan.hu-pan.com (hu-pan.com [67.15.6.3]) by unicode.org (8.12.11/8.12.11) with ESMTP id i73A6sRW009727; Tue, 3 Aug 2004 05:06:58 -0500 Received: from [213.162.124.237] (helo=[127.0.0.1]) by pan.hu-pan.com with asmtp (Exim 4.34) id 1BrwC6-0003nh-4R; Tue, 03 Aug 2004 11:06:54 +0100 Message-ID: <410F63C2.3060506@qaya.org> Date: Tue, 03 Aug 2004 11:06:58 +0100 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040616 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: Jony Rosenne CC: Hebrew List , "'Unicode List'" Subject: [hebrew] Re: Holam (was Errors in TUS Figure 15.2?) 
References: <000e01c47924$d4bff530$0401c80a@QSM4> In-Reply-To: <000e01c47924$d4bff530$0401c80a@QSM4> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - pan.hu-pan.com X-AntiAbuse: Original Domain - unicode.org X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [47 12] X-AntiAbuse: Sender Address Domain - qaya.org X-Source: X-Source-Args: X-Source-Dir: X-archive-position: 1927 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peterkirk@qaya.org Precedence: bulk X-list: hebrew On 03/08/2004 07:40, Jony Rosenne wrote: >The same applies to recent arguments raised concerning the Holam and Vav and >the philosophical nature of the ways they combine. > >Jony > > > Agreed. If the proposed encoding with ZWNJ does what is needed (or should do when implementations are updated to support TUS 4.0.1), let's stop philosophical objections and accept it. If it really does not, we need to look elsewhere. 
-- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/ From peterkirk@qaya.org Tue Aug 3 05:48:31 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 05:48:31 -0500 (CDT) Received: from pan.hu-pan.com (hu-pan.com [67.15.6.3]) by unicode.org (8.12.11/8.12.11) with ESMTP id i73AmVm4020348 for ; Tue, 3 Aug 2004 05:48:31 -0500 Received: from [213.162.124.237] (helo=[127.0.0.1]) by pan.hu-pan.com with asmtp (Exim 4.34) id 1BrwqM-0004pR-QE; Tue, 03 Aug 2004 11:48:31 +0100 Message-ID: <410F6D84.4030601@qaya.org> Date: Tue, 03 Aug 2004 11:48:36 +0100 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040616 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: Michael Everson CC: hebrew@unicode.org, Peter Constable , Ken Whistler Subject: [hebrew] Re: Holam background document References: <000301c47273$c2717f00$0100000a@QSM4> <030d01c47277$ab2ca140$3278fe51@VENGEROV> <41055148.6030402@tiro.com> <41056E86.6070102@qaya.org> <4106AB5B.2060604@tiro.com> <4106D619.5050203@qaya.org> <41071CEE.4050005@tiro.com> <20040728034816.GD11028@ccil.org> <4107279B.7040705@tiro.com> <41077B05.4000208@qaya.org> <410843B7.2070900@tiro.com> <4108C0F4.8030301@qaya.org> <410940B0.9080304@tiro.com> <410985FF.7040105@qaya.org> <41099580.9090303@tiro.com> <410A3BE5.7010808@qaya.org> <410AB4E7.5020903@tiro.com> <018801c47885$89c2e050$6801a8c0@VENGEROV> <410E81EF.6090906@tiro.com> <410EB495.3080403@qaya.org> <410EBA58.9040908@tiro.com> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - pan.hu-pan.com X-AntiAbuse: Original Domain - unicode.org X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [47 12] X-AntiAbuse: Sender Address Domain - qaya.org X-Source: X-Source-Args: X-Source-Dir: X-archive-position: 1928 
X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peterkirk@qaya.org Precedence: bulk X-list: hebrew On 03/08/2004 02:40, Michael Everson wrote: > At 15:04 -0700 2004-08-02, John Hudson wrote: > >> As far as I'm concerned, we're trying to correct a mistake in the >> Hebrew block. There is only *one* logical way to do this that is >> perfectly consistent with the character/glyph model and the identity >> of the dot on holam male. That is to separately encode the dot for >> holam male character that should have been encoded in the first place. > > > I disagree, John. Ada Yardeni says quite explicitly: > > "Holam [is] a dot above a Waw or to the left of the upper corner of > any letter." > > and > > "The Holam on the Waw should be placed above the centre of its "head", > while the Holam on other letters should be placed to the left of their > upper corners." It is clear that Yardeni is describing what we have agreed to call "less exact typography" and is ignoring the significant minority of pointed Hebrew printing which distinguishes two positions of Holam on Vav. > > Yardeni has described perfectly what I judge to be consensus on this > list. ... On the contrary. If there is one thing on which there is consensus on this list, it is that two positions of Holam on Vav should be distinguished, at least optionally. But Yardeni's description does not address this point at all. > ... > > Similarly we have the case in Hebrew. The tradition agrees that the > position of the dot has significance. The tradition -- and I consider > Yardeni to be authoritative in speaking for the tradition -- suggests > quite clearly what the default behaviour of HOLAM is. It is clear that the user community does not consider Yardeni to be authoritative on this, and indeed that Yardeni makes no claim to be authoritative. 
The best way to find the authoritative position is to listen to what many Hebrew users are telling you rather than relying on a book. Anyway, it is quite illogical to appeal to Yardeni on the significance of different positions of the dot on Vav when Yardeni does not mention different positions. ... > > The HOLAM HASER FOR VAV proposal is simple. It recognizes that HEBREW > POINT HOLAM is the default character used for a dot above representing > [o] in the Hebrew script. It recognizes that there are many > implementations which do not distinguish VAV + HOLAM graphically when > it could mean either [o] or [vo]. It recognizes that, in > implementations which *do* make such a distinction, that the dot for > VAV + HOLAM with the meaning [o] is centred over the VAV, rather than > positioned to the left as it is for all other characters, and it > recognizes that the dot for VAV + HOLAM with the meaning [vo] is > positioned to the left. ... Let's clarify this. In this latter combination the dot still has meaning [o], and the Vav has meaning [v]. In Holam Male, the dot again has meaning [o]. The Holam dot has only one meaning in all contexts, although the meaning of the Vav changes. > ... It recognizes that the left positioning of the HOLAM over VAV is a > *marked* positioning, and it is for this reason that a new character > was proposed for this particular usage. Agreed that this is the marked position. Not agreed that a new character is the best way to indicate this. A new character is not appropriate for a glyph variant, according to the character/glyph model. > > This is completely analogous to the model accepted by the UTC and WG2 > for the character QAMATS QATAN. The proposal for that character > recognized that there are many implementations which do not > distinguish QAMATS when it could mean either [a] or [o]. 
It recognized > that in implementations which *do* make such a distinction, that the > shape for QAMATS QATAN with the meaning [o] is different from the > normal shape for QAMATS, and it recognized that the shape for QAMATS > QATAN with the meaning [o] is larger, or otherwise differently shaped. > It recognized that the special shape of the QAMATS QATAN is a *marked* > shape, and it is for this reason that a new character was proposed for > this particular usage. This analogy breaks down in several ways. The most obvious is that QAMATS QATAN has a different shape from QAMATS, but HOLAM HASER FOR VAV is graphically identical to HOLAM when both are used with the Holam Haser function. Also I don't think QAMATS QATAN is pronounced [o] like HOLAM, rather more like [ɔ] (open o), but this may depend on the exact tradition. > > The fact that the distinction between QAMATS and QAMATS QATAN was made > in the twentieth century, and that the distinction between HOLAM and > HOLAM HASER FOR VAV was made in the eleventh century, ... Tenth, or earlier, actually. The Aleppo codex, represented in your proposal, dates from c.930 CE. 
-- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/ From peterkirk@qaya.org Tue Aug 3 05:59:14 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 05:59:14 -0500 (CDT) Received: from pan.hu-pan.com (hu-pan.com [67.15.6.3]) by unicode.org (8.12.11/8.12.11) with ESMTP id i73AxEUk027878 for ; Tue, 3 Aug 2004 05:59:14 -0500 Received: from [213.162.124.237] (helo=[127.0.0.1]) by pan.hu-pan.com with asmtp (Exim 4.34) id 1Brx0j-00054T-9l; Tue, 03 Aug 2004 11:59:13 +0100 Message-ID: <410F7007.4070406@qaya.org> Date: Tue, 03 Aug 2004 11:59:19 +0100 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040616 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: John Hudson CC: Jony Rosenne , hebrew@unicode.org Subject: [hebrew] Re: UTC - Holam proposals References: <000a01c4790c$32a689c0$0401c80a@QSM4> <410F137C.9020102@tiro.com> In-Reply-To: <410F137C.9020102@tiro.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - pan.hu-pan.com X-AntiAbuse: Original Domain - unicode.org X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [47 12] X-AntiAbuse: Sender Address Domain - qaya.org X-Source: X-Source-Args: X-Source-Dir: X-archive-position: 1929 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peterkirk@qaya.org Precedence: bulk X-list: hebrew On 03/08/2004 05:24, John Hudson wrote: > ... > Actually, Peter's proposal to use ZWNJ does not attempt to solve the > problem in the context of holam male, it attempts to solve the problem > in the context of vav haluma, i.e. it treats vav haluma as the variant > form that needs special handling. 
There are practical reasons for > doing this, and Michael also believes that there are logical reasons. > I don't really care about the logical reasons, since most people seem > to be swayed by the practical reasons, which was my point about > compromise. Michael has argued that making the distinction in vav > haluma instead of holam male is not a compromise, and I don't see any > point in arguing this. Whether it is a compromise or not, it is what > both proposals do. True enough. I don't believe in the logical reasons but I do in the practical ones. And we are more or less agreed that this kind of compromise is the right way to go. However, the practical as well as theoretical consequences of such a compromise need to be minimised. If A and B are really the same thing but have to be disunified as a compromise for a specific situation, the ill effects are minimised if the new encodings for A and B are as similar as possible. For example, they may be distinguished by default ignorable characters, which will be automatically ignored by processes which do not need to support the distinction. In this case, since variation selectors are ruled out, the minimum effects are those resulting from making the distinction with ZWNJ or ZWJ. 
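The point about default ignorable characters can be illustrated with a minimal Python sketch. It assumes that the marked form (Vav Haluma) is written as VAV, ZWNJ, HOLAM in that order, which is an assumption about the proposal and is not spelled out in this message; `strip_joiners` is a hypothetical helper standing in for any process that ignores ZWNJ and ZWJ.

```python
import unicodedata

VAV = "\u05D5"    # HEBREW LETTER VAV
HOLAM = "\u05B9"  # HEBREW POINT HOLAM
ZWNJ = "\u200C"   # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"    # ZERO WIDTH JOINER

# Unmarked (default) sequence, used here for Holam Male.
holam_male = VAV + HOLAM

# Marked sequence for Vav Haluma. The position of ZWNJ between
# the base letter and the point is an assumption for illustration.
vav_haluma = VAV + ZWNJ + HOLAM


def strip_joiners(text):
    """Hypothetical helper: drop ZWNJ and ZWJ, as a process that
    does not care about the distinction might do."""
    return text.replace(ZWNJ, "").replace(ZWJ, "")


# The two encodings differ as stored...
differs_as_stored = holam_male != vav_haluma

# ...but collapse to the same string once the joiners are ignored.
same_when_ignored = strip_joiners(vav_haluma) == holam_male
```

With a separately encoded new character, by contrast, a process that does not need the distinction would have to know about that character explicitly in order to treat the two forms alike.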
-- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/ From kfeuerherm@wlu.ca Tue Aug 3 07:12:46 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 07:12:47 -0500 (CDT) Received: from wlu.ca (wluw5.wlu.ca [192.54.242.118]) by unicode.org (8.12.11/8.12.11) with SMTP id i73CCkfw010848 for ; Tue, 3 Aug 2004 07:12:46 -0500 Received: from WLU-MTA by wlu.ca with Novell_GroupWise; Tue, 03 Aug 2004 08:12:45 -0400 Message-Id: X-Mailer: Novell GroupWise Internet Agent 6.5.2 Beta Date: Tue, 03 Aug 2004 08:12:28 -0400 From: "Karljurgen Feuerherm" To: , Cc: , , , Subject: [hebrew] Re: Holam background document Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 1930 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: kfeuerherm@wlu.ca Precedence: bulk X-list: hebrew >>> John Hudson 02/08/2004 6:04:08 pm >>> >Peter Kirk wrote: >> If every Arabic presentation form and every Indic ligature had been >> encoded as a separate character, that would have simplified rendering a >> great deal, and avoided all sorts of nasty problems like the ones in >> Public Review Issue #37. But Unicode chose instead the character/glyph >> model. It should stick with it. >To me it is perfectly obvious that the dot used in holam male should never have been unified with the holam haser: This seems rather extreme to me... > these are separate characters according to pretty much every criteria that >Unicode has ever used to determine character identity. although I can see why they CAN be considered separate characters by those criteria. >As far as I'm concerned, we're trying to correct a mistake in the Hebrew block. There is >only *one* logical way to do this that is perfectly consistent with the character/glyph >model and the identity of the dot on holam male. 
That is to separately encode the dot for >holam male character that should have been encoded in the first place. But, from a practical standpoint (I accept what you say about the need for compromise), I am not sure how this would have differed from the one discussed below? Isn't the alternate proposal precisely to do this? (i.e. I would appreciate a clarification). >This solution has been rejected because of existing data that uses the existing holam dot >for holam male, the relative frequency of holam male, and the acceptance of the holam male >formation as an ambiguous rendering by 'less exact' typographers. >It should be perfectly obvious to anyone that this rejection forces us into a position of >compromise, because whatever solution is selected will not be the one logical solution >that is consistent with the character/glyph model and the identity of holam male dot as a >separate character. >It is precisely because the solution will be a compromise that no one is going to be >completely happy with the result, but also why there is no point on standing on principle >or launching objections from fundamentals that have already been compromised. We need a >solution that works reliably and gets the job done, and that's all we can really hope for. >Purity isn't on the table. This is most certainly true. 
K From kfeuerherm@wlu.ca Tue Aug 3 07:25:04 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 07:25:04 -0500 (CDT) Received: from wlu.ca (wluw5.wlu.ca [192.54.242.118]) by unicode.org (8.12.11/8.12.11) with SMTP id i73CP4gf012930 for ; Tue, 3 Aug 2004 07:25:04 -0500 Received: from WLU-MTA by wlu.ca with Novell_GroupWise; Tue, 03 Aug 2004 08:25:03 -0400 Message-Id: X-Mailer: Novell GroupWise Internet Agent 6.5.2 Beta Date: Tue, 03 Aug 2004 08:24:49 -0400 From: "Karljurgen Feuerherm" To: , Cc: , , , Subject: [hebrew] Re: Holam background document Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 1931 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: kfeuerherm@wlu.ca Precedence: bulk X-list: hebrew >>> Peter Kirk 02/08/2004 6:49:30 pm >>> On 02/08/2004 23:04, John Hudson wrote: > Peter Kirk wrote: >And then Jony has already repeatedly objected to "exact" typography >being considered normative, and certainly to the deliberate decision not >to make the distinction being called "poorer quality". Less exact >typography is more common, and is a valid typographical choice. The >distinction is an optional one, and neither text providers nor fonts >should be obliged to support it. I fail to see how John has made that normative. It goes without saying that in order for it to be even optional it has to be guaranteed to work, which is the heart of his point. >I agree. But what is the job which needs to be done? Clearly the >majority of Hebrew users at least on this list don't accept that the new >character actually does the right job, I do not accept this. So far, I am convinced that they do not accept it because they do not like it. That is not the same thing, and not a legitimate argument if that is in fact the underlying reality. >Please, everyone, listen to >their views. 
Otherwise you will just end up with a new character which >no one will use, because the user community see the solution as worse >than the problem to be solved. This would be a waste of everyone's time. So long as it works, I will use it. And if everyone else wishes not to use it on principle, then that's their problem. What needs to be demonstrated now is that it WON'T work. And please don't bring up the existing data again. It will continue to work as well as it has so far (not perfectly in other words) and it has already been brought to our attention that conservatism re: existing data is not a legitimate argument. Fix involves change.... K From peterkirk@qaya.org Tue Aug 3 08:48:49 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 08:48:49 -0500 (CDT) Received: from pan.hu-pan.com (hu-pan.com [67.15.6.3]) by unicode.org (8.12.11/8.12.11) with ESMTP id i73Dmmi6008450 for ; Tue, 3 Aug 2004 08:48:49 -0500 Received: from [213.162.124.237] (helo=[127.0.0.1]) by pan.hu-pan.com with asmtp (Exim 4.34) id 1Brzem-0002NW-St; Tue, 03 Aug 2004 14:48:45 +0100 Message-ID: <410F97C2.9040503@qaya.org> Date: Tue, 03 Aug 2004 14:48:50 +0100 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040616 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: Karljurgen Feuerherm CC: tiro@tiro.com, everson@evertype.com, petercon@microsoft.com, ken.whistler@sybase.com, hebrew@unicode.org Subject: [hebrew] Re: Holam background document References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - pan.hu-pan.com X-AntiAbuse: Original Domain - unicode.org X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [47 12] X-AntiAbuse: Sender Address Domain - qaya.org X-Source: X-Source-Args: X-Source-Dir: X-archive-position: 1932 
X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peterkirk@qaya.org Precedence: bulk X-list: hebrew On 03/08/2004 13:12, Karljurgen Feuerherm wrote: > ... > >>As far as I'm concerned, we're trying to correct a mistake in the >> >> >Hebrew block. There is > > >>only *one* logical way to do this that is perfectly consistent with >> >> >the character/glyph > > >>model and the identity of the dot on holam male. That is to separately >> >> >encode the dot for > > >>holam male character that should have been encoded in the first >> >> >place. > >But, from a practical standpoint (I accept what you say about the need >for compromise), I am not sure how this would have differed from the one >discussed below? > >Isn't the alternate proposal precisely to do this? (i.e. I would >appreciate a clarification). > > > John H's preferred alternative is one character for the dot in Holam Male only, and another character for all cases of Holam Haser. Theoretically neat, although not necessarily ideal. Everson and Shoulson's proposal is for one character both for the dot in Holam Male and for Holam Haser with every base character except for Vav, and another character for Holam Haser only when the base character is Vav. Theoretically a mess as the distinction is arbitrary, based neither on the glyph nor the semantics but only on the context (the base character). 
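The difference between the two schemes described above can be sketched in Python. This is only an illustration under stated assumptions: the two "new" characters are represented by Private Use Area placeholders (`NEW_MALE_DOT`, `NEW_HASER_FOR_VAV`), which are hypothetical stand-ins and not code points from either proposal, and the function names are invented for this sketch.

```python
VAV = "\u05D5"    # HEBREW LETTER VAV
LAMED = "\u05DC"  # HEBREW LETTER LAMED
HOLAM = "\u05B9"  # HEBREW POINT HOLAM

# Hypothetical placeholders only (Private Use Area), not real assignments.
NEW_MALE_DOT = "\uE000"       # stand-in: a separate dot for Holam Male
NEW_HASER_FOR_VAV = "\uE001"  # stand-in: Holam Haser used only on Vav


def encode_hudson(base, kind):
    """John H's preferred alternative: the choice of character depends
    on the function of the dot ('male' or 'haser'), not on the base."""
    return base + (NEW_MALE_DOT if kind == "male" else HOLAM)


def encode_everson_shoulson(base, kind):
    """Everson and Shoulson: the existing HOLAM is used everywhere,
    except for Holam Haser when the base character is Vav."""
    if kind == "haser" and base == VAV:
        return base + NEW_HASER_FOR_VAV
    return base + HOLAM


# Under the second scheme, Holam Haser is encoded differently
# depending only on the base letter it sits on.
haser_on_vav = encode_everson_shoulson(VAV, "haser")
haser_on_lamed = encode_everson_shoulson(LAMED, "haser")
context_dependent = haser_on_vav[1] != haser_on_lamed[1]
```

The last three lines show the property being objected to: the same Holam Haser gets a different code depending only on its base character.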
-- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/ From rosennej@qsm.co.il Tue Aug 3 09:14:15 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 09:14:15 -0500 (CDT) Received: from mx-out.daemonmail.net (mx-out.daemonmail.net [216.104.160.39]) by unicode.org (8.12.11/8.12.11) with ESMTP id i73EEBwc014474 for ; Tue, 3 Aug 2004 09:14:15 -0500 Received: from localhost.daemonmail.net (localhost.daemonmail.net [127.0.0.1]) by mx-out.daemonmail.net (8.12.9p2/8.12.9) with SMTP id i73EDnaw033594; Tue, 3 Aug 2004 07:13:55 -0700 (PDT) (envelope-from rosennej@qsm.co.il) Received: from [212.235.101.96] (via account qsm.co.il) by mx-out.daemonmail.net with ESMTP id kj80GoE2 authenticated by POP; Tue, 03 Aug 2004 07:13:47 -0700 (PDT) From: "Jony Rosenne" To: "'Karljurgen Feuerherm'" , , Cc: , , , Subject: [hebrew] Re: Holam background document Date: Tue, 3 Aug 2004 17:14:01 +0300 Message-ID: <000b01c47964$28a88600$0100000a@QSM4> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook, Build 10.0.6626 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2800.1441 In-Reply-To: Importance: Normal Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by unicode.org id i73EEBwc014474 X-archive-position: 1933 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: rosennej@qsm.co.il Precedence: bulk X-list: hebrew > -----Original Message----- > From: hebrew-bounce@unicode.org > [mailto:hebrew-bounce@unicode.org] On Behalf Of Karljurgen Feuerherm > Sent: Tuesday, August 03, 2004 3:25 PM > To: peterkirk@qaya.org; tiro@tiro.com > Cc: everson@evertype.com; petercon@microsoft.com; > ken.whistler@sybase.com; hebrew@unicode.org > Subject: [hebrew] Re: Holam background document > > > > > >>> Peter Kirk 02/08/2004 6:49:30 pm >>> > On 02/08/2004 23:04, John Hudson 
wrote: > > > Peter Kirk wrote: ... > I do not accept this. So far, I am convinced that they do not > accept it because they do not like it. That is not the same > thing, and not a legitimate argument if that is in fact the > underlying reality. Karljurgen, I object to this. I don't recall anyone saying anything that could justify this remark. I for one, do not accept it because it is a bad solution and wrong, and I had explained why. Also, it will create interchange problems not only with the existing data, which for some reason some people do not consider important, but also for new data. I consider the interchange of texts to be one of the main aims of Unicode. Otherwise, we could just tell the users to use the PUA for any character they miss. Jony > > What needs to be demonstrated now is that it WON'T work. This is an unacceptable proposition. There are two alternatives, both will work, but one (HOLAM HASER for VAV) is bad, ill conceived and wrong, and the other a possibly acceptable compromise. > > And please don't bring up the existing data again. It will > continue to work as well as it has so far (not perfectly in > other words) and it has already been brought to our attention > that conservatism re: existing data is not a legitimate > argument. Fix involves change.... It isn't a fix, because nothing was broken, it is an additional distinction which was missing because there was no good solution. Now that the UTC has further defined the use of ZWNJ we have a reasonable proposal. 
Jony > > K > > > From peterkirk@qaya.org Tue Aug 3 09:23:39 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 09:23:39 -0500 (CDT) Received: from pan.hu-pan.com (hu-pan.com [67.15.6.3]) by unicode.org (8.12.11/8.12.11) with ESMTP id i73ENdnt015621 for ; Tue, 3 Aug 2004 09:23:39 -0500 Received: from modem655.netkonect.net ([194.164.15.147] helo=[127.0.0.1]) by pan.hu-pan.com with asmtp (Exim 4.34) id 1Bs0CT-0003N5-7R; Tue, 03 Aug 2004 15:23:36 +0100 Message-ID: <410F9FE5.5040500@qaya.org> Date: Tue, 03 Aug 2004 15:23:33 +0100 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040616 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: Karljurgen Feuerherm CC: tiro@tiro.com, everson@evertype.com, petercon@microsoft.com, ken.whistler@sybase.com, hebrew@unicode.org Subject: [hebrew] Re: Holam background document References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - pan.hu-pan.com X-AntiAbuse: Original Domain - unicode.org X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [47 12] X-AntiAbuse: Sender Address Domain - qaya.org X-Source: X-Source-Args: X-Source-Dir: X-archive-position: 1934 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peterkirk@qaya.org Precedence: bulk X-list: hebrew On 03/08/2004 13:24, Karljurgen Feuerherm wrote: > > >>>>Peter Kirk 02/08/2004 6:49:30 pm >>> >>>> >>>> >On 02/08/2004 23:04, John Hudson wrote: > > > >>Peter Kirk wrote: >>And then Jony has already repeatedly objected to "exact" typography >>being considered normative, and certainly to the deliberate decision >> >> >not > > >>to make the distinction being called "poorer quality". 
Less exact >>typography is more common, and is a valid typographical choice. The >>distinction is an optional one, and neither text providers nor fonts >>should be obliged to support it. >> >> > >I fail to see how John has made that normative. It goes without saying >that in order for it to be even optional it has to be guaranteed to >work, which is the heart of his point. > > > If there is one character for Holam Haser and another for the dot in Holam Male, users need to decide which is which in order to represent texts correctly. What is needed, rather, is one representation which is the default and another to be used only for the marked case. Anyway, what is the sense of "guaranteed to work"? There is rather little in Unicode which is actually guaranteed or required to work in any conformant implementation; that doesn't include support for ZWNJ, but it doesn't even include distinct glyphs for clearly distinct characters: rendering with a fallback font is conformant. But Hebrew, as a complex script, will be supported properly only by rendering engines designed to work with complex scripts. I doubt if in practice anyone will write a general purpose rendering engine which supports Hebrew but not Arabic. The implication is that rendering engines which support Hebrew will already have to support ZWNJ, which is required for Arabic script (certainly for Persian), and so they should support the proposed Hebrew representation with ZWNJ. >>I agree. But what is the job which needs to be done? Clearly the >>majority of Hebrew users at least on this list don't accept that the >> >> >new > > >>character actually does the right job, >> >> > >I do not accept this. So far, I am convinced that they do not accept it >because they do not like it. That is not the same thing, and not a >legitimate argument if that is in fact the underlying reality. > > They do not accept it primarily because it goes against their intuitions about how the Hebrew script works. 
Such intuitions, many from people who have been reading Hebrew since early childhood and have been immersed in Hebrew literature all their life, are likely to be far more reliable than impressions taken from a single book. The likelihood is that these intuitions represent the underlying reality. > > >>Please, everyone, listen to >>their views. Otherwise you will just end up with a new character which >> >> > > > >>no one will use, because the user community see the solution as worse >> >> > > > >>than the problem to be solved. This would be a waste of everyone's >> >> >time. > >So long as it works, I will use it. And if everyone else wishes not to >use it on principle, then that's their problem. > >What needs to be demonstrated now is that it WON'T work. > > I'm not sure what "it" is here. If you refer to my proposal, this has not been demonstrated although it has been alleged. If you refer to the Everson and Shoulson proposal, I accept that it can be made to work if texts are represented by glyph codes rather than abstract characters, but then so would separate encoding of each Arabic contextual form and Indic ligature. If people don't use it, it will not be on principle but because they don't see the need for it. See my next paragraph. >And please don't bring up the existing data again. It will continue to >work as well as it has so far (not perfectly in other words) and it has >already been brought to our attention that conservatism re: existing >data is not a legitimate argument. Fix involves change.... > > > The problem is that, whether we like it or not, people won't fix things which they don't perceive to be broken. The current perception of most Hebrew computer users is that there is not a problem. is used for Holam Male. When Vav Haluma is needed, either it is not distinguished (following the majority of printed material), or, when more exact typography is required, ZWNJ or ZWJ is inserted. (For example, Mechon Mamre does this.) 
This is done because it *works* (at least well enough for non-experts in typography), *now*, at least with the only rendering engine most of these people are interested in i.e. Uniscribe, and probably with any rendering engine which can do a decent job with Hebrew. My proposal is effectively to make official what is the majority de facto practice. But even if it is not made official, many people will continue to use it. They will certainly continue to do so until fonts supporting the new character are distributed as widely as the existing fonts which support the current practice. By that time this solution is likely to be far too deeply entrenched to be changed. I am talking realities here. For reasons like this the Israeli national standards body looks like opposing the Everson and Shoulson proposal (see Jony's recent posting), because they have to reflect reality in their country. -- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/ From kfeuerherm@wlu.ca Tue Aug 3 09:40:41 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 09:40:41 -0500 (CDT) Received: from wlu.ca (wluw5.wlu.ca [192.54.242.118]) by unicode.org (8.12.11/8.12.11) with SMTP id i73Eecj3018403 for ; Tue, 3 Aug 2004 09:40:41 -0500 Received: from WLU-MTA by wlu.ca with Novell_GroupWise; Tue, 03 Aug 2004 10:40:37 -0400 Message-Id: X-Mailer: Novell GroupWise Internet Agent 6.5.2 Beta Date: Tue, 03 Aug 2004 10:39:57 -0400 From: "Karljurgen Feuerherm" To: Subject: [hebrew] Re: Holam background document Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 1935 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: kfeuerherm@wlu.ca Precedence: bulk X-list: hebrew Ah, I see, thanks. Mess, as far as I can see, mainly from a semantics point of view, though. 
Which Unicode-et-al has said is not a primary argument.... K >>> Peter Kirk 03/08/2004 9:48:50 am >>> On 03/08/2004 13:12, Karljurgen Feuerherm wrote: > ... > >>As far as I'm concerned, we're trying to correct a mistake in the >> >> >Hebrew block. There is > > >>only *one* logical way to do this that is perfectly consistent with >> >> >the character/glyph > > >>model and the identity of the dot on holam male. That is to separately >> >> >encode the dot for > > >>holam male character that should have been encoded in the first >> >> >place. > >But, from a practical standpoint (I accept what you say about the need >for compromise), I am not sure how this would have differed from the one >discussed below? > >Isn't the alternate proposal precisely to do this? (i.e. I would >appreciate a clarification). > > > John H's preferred alternative is one character for the dot in Holam Male only, and another character for all cases of Holam Haser. Theoretically neat, although not necessarily ideal. Everson and Shoulson's proposal is for one character both for the dot in Holam Male and for Holam Haser with every base character except for Vav, and another character for Holam Haser only when the base character is Vav. Theoretically a mess as the distinction is arbitrary, based neither on the glyph nor the semantics but only on the context (the base character). 
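For comparison with the two new-character options just described, the ZWNJ representation discussed elsewhere in this thread can be written out as plain code point sequences. What follows is only a minimal sketch, not anyone's formal proposal text: it assumes the joiner sits between Vav and Holam, and the names holam_male, vav_haluma, describe and survives_normalization are made up for illustration. It checks only that the marked sequence is distinct from the default one and is left unchanged by the four Unicode normalization forms; it says nothing about how any particular rendering engine displays it.

```python
import unicodedata

VAV = "\u05D5"    # HEBREW LETTER VAV
HOLAM = "\u05B9"  # HEBREW POINT HOLAM
ZWNJ = "\u200C"   # ZERO WIDTH NON-JOINER

# Default sequence, read as Holam Male in current practice.
holam_male = VAV + HOLAM

# Assumed marked sequence for Vav with Holam Haser (Vav Haluma):
# the joiner is placed between VAV and HOLAM. This placement is an assumption.
vav_haluma = VAV + ZWNJ + HOLAM


def describe(text):
    """List each code point of the text with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


def survives_normalization(text):
    """True if NFC, NFD, NFKC and NFKD all leave the text unchanged."""
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return all(unicodedata.normalize(form, text) == text for form in forms)


if __name__ == "__main__":
    for label, seq in (("holam male", holam_male), ("vav haluma", vav_haluma)):
        print(label, describe(seq), survives_normalization(seq))
```

Because the two sequences differ only by the joiner, text that omits it still reads as the default case, which is the "one default, one marked case" arrangement argued for in this thread.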
-- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/ From kfeuerherm@wlu.ca Tue Aug 3 09:45:50 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 09:45:50 -0500 (CDT) Received: from wlu.ca (wluw5.wlu.ca [192.54.242.118]) by unicode.org (8.12.11/8.12.11) with SMTP id i73EjogL020314 for ; Tue, 3 Aug 2004 09:45:50 -0500 Received: from WLU-MTA by wlu.ca with Novell_GroupWise; Tue, 03 Aug 2004 10:45:49 -0400 Message-Id: X-Mailer: Novell GroupWise Internet Agent 6.5.2 Beta Date: Tue, 03 Aug 2004 10:45:11 -0400 From: "Karljurgen Feuerherm" To: Subject: [hebrew] Re: UTC - Holam proposals Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 1936 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: kfeuerherm@wlu.ca Precedence: bulk X-list: hebrew >>> "Jony Rosenne" 03/08/2004 9:58:53 am >>> >I want to make the distinction, but in such a way that it will not cause >problems to all the various users. Ok, I agree mostly; I don't believe that we can provide any solution that won't cause problems to some users, though (of whatever type) so the issue then is to decide how much inconvenience to how many is acceptable--keeping in mind that in certain ways of drawing that line, it is implicit that there is no solution possible. >If we don't have a good solution, then we should work on it a bit more. If good means reasonably practical, then yes, I agree with you. I agree also with John H that 'purity' or to put it another way 'perfection' is no longer an option. >The HOLAM HASER for VAV solution is very bad. I'm open-minded here: discussion of why it's bad (other than that the existing data doesn't conform) would be good. 
>The ZWNJ solution is acceptable to me and to many others, but if it is not >acceptable to the UTC we all have to see what can be done, but this should >not be used as the reason to choose a bad solution because it is the only >alternative on the table. I would be happy to live with this if I were convinced that it would work as claimed; but a lot of discussion lately has suggested that it may not. I am happy for this to go to UTC as the final road test. I should be frank, though, in saying that all that I've heard so far from that direction suggests it will not pass.... But the proof of the pudding is in the eating. Never mind my doomsday prophecies.... :) K > -----Original Message----- > From: Karljurgen Feuerherm [mailto:kfeuerherm@wlu.ca] > Sent: Tuesday, August 03, 2004 3:30 PM > To: rosennej@qsm.co.il > Subject: Re: [hebrew] UTC - Holam proposals > > > This is not a valid or relevant argument. > > Encodings of the type we are trying to do are much much > younger than 1100 years, and relative to that, the problem is > significant to those who wish to encode the distinction which > is being made by some. > > Clearly, it is not significant to those who don't wish to > make the distinction. But then that's not the issue. > > K > > >>> "Jony Rosenne" 02/08/2004 11:44:24 pm >>> > Since the issue is 1100 years old, a few more months will not > be so significant, relatively speaking. 
> > Jony > > > -- > > Michael Everson * * Everson Typography * * http://www.evertype.com > > > > > > > > > > > From kfeuerherm@wlu.ca Tue Aug 3 09:51:40 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 09:51:40 -0500 (CDT) Received: from wlu.ca (wluw5.wlu.ca [192.54.242.118]) by unicode.org (8.12.11/8.12.11) with SMTP id i73EpZua021376 for ; Tue, 3 Aug 2004 09:51:39 -0500 Received: from WLU-MTA by wlu.ca with Novell_GroupWise; Tue, 03 Aug 2004 10:51:35 -0400 Message-Id: X-Mailer: Novell GroupWise Internet Agent 6.5.2 Beta Date: Tue, 03 Aug 2004 10:51:07 -0400 From: "Karljurgen Feuerherm" To: Subject: [hebrew] Re: Holam background document Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 1937 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: kfeuerherm@wlu.ca Precedence: bulk X-list: hebrew Jony, I'm not trying to be inflammatory, and apologize if it appears that way. What I'm saying is that I haven't found what I've read to be convincing (and I admit I've not been able to read everything; I simply can't afford that much time) and that what I've felt, reading between the lines, comes across to me more as discomfort or dislike than substantial technical countering. I'm not suggesting that anyone is being bloody-minded--it is a natural and well-documented psychological phenomenon. So of course no one would say 'I dislike it,' because in all likelihood he/she isn't aware of the degree to which that plays a part in the first place (including me). As for the existing data: the point of view seems to be in part that one should be taking a long-term view, and that if the existing data, which has been encoded over a few years, has to be translated or whatever to match a new encoding that will last many years, then that is a reasonable sacrifice.
I'm not sure anyway that anything terrible will happen to existing data. It simply won't be able to do what the new data, using the new features, will be able to do. But if so, that's just life. K >>> "Jony Rosenne" 03/08/2004 10:14:01 am >>> > -----Original Message----- > From: hebrew-bounce@unicode.org > [mailto:hebrew-bounce@unicode.org] On Behalf Of Karljurgen Feuerherm > Sent: Tuesday, August 03, 2004 3:25 PM > To: peterkirk@qaya.org; tiro@tiro.com > Cc: everson@evertype.com; petercon@microsoft.com; > ken.whistler@sybase.com; hebrew@unicode.org > Subject: [hebrew] Re: Holam background document > > > > > >>> Peter Kirk 02/08/2004 6:49:30 pm >>> > On 02/08/2004 23:04, John Hudson wrote: > > > Peter Kirk wrote: ... > I do not accept this. So far, I am convinced that they do not > accept it because they do not like it. That is not the same > thing, and not a legitimate argument if that is in fact the > underlying reality. Karljurgen, I object to this. I don't recall anyone saying anything that could justify this remark. I, for one, do not accept it because it is a bad solution and wrong, and I have explained why. Also, it will create interchange problems not only with the existing data, which for some reason some people do not consider important, but also for new data. I consider the interchange of texts to be one of the main aims of Unicode. Otherwise, we could just tell the users to use the PUA for any character they miss. Jony > > What needs to be demonstrated now is that it WON'T work. This is an unacceptable proposition. There are two alternatives; both will work, but one (HOLAM HASER for VAV) is bad, ill-conceived and wrong, and the other a possibly acceptable compromise. > > And please don't bring up the existing data again. It will > continue to work as well as it has so far (not perfectly in > other words) and it has already been brought to our attention > that conservatism re: existing data is not a legitimate > argument. Fix involves change....
It isn't a fix, because nothing was broken; it is an additional distinction which was missing because there was no good solution. Now that the UTC has further defined the use of ZWNJ we have a reasonable proposal. Jony > > K > > > From kfeuerherm@wlu.ca Tue Aug 3 09:59:23 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 09:59:23 -0500 (CDT) Received: from wlu.ca (wluw5.wlu.ca [192.54.242.118]) by unicode.org (8.12.11/8.12.11) with SMTP id i73ExJw2024708 for ; Tue, 3 Aug 2004 09:59:23 -0500 Received: from WLU-MTA by wlu.ca with Novell_GroupWise; Tue, 03 Aug 2004 10:59:19 -0400 Message-Id: X-Mailer: Novell GroupWise Internet Agent 6.5.2 Beta Date: Tue, 03 Aug 2004 10:59:01 -0400 From: "Karljurgen Feuerherm" To: Subject: [hebrew] Re: Holam background document Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Content-Disposition: inline X-archive-position: 1938 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: kfeuerherm@wlu.ca Precedence: bulk X-list: hebrew >>> Peter Kirk 03/08/2004 10:23:33 am >>> >Anyway, what is the sense of "guaranteed to work"? etc. I think John H has responded to this at length and better than I could hope to repeat. >>I agree. But what is the job which needs to be done? Clearly the >>majority of Hebrew users at least on this list don't accept that the I said: >I do not accept this. So far, I am convinced that they do not accept it >because they do not like it. That is not the same thing, and not a >legitimate argument if that is in fact the underlying reality. > > >They do not accept it primarily because it goes against their intuitions >about how the Hebrew script works. It goes against my intuitions as well, despite the fact that I have been reading Hebrew for only 15 years and only sporadically and only in an academic environment.
The point that Michael Everson is making (I think) is that from a TEXT stream point of view, it isn't about such things, or about analysis, or about phonemes, or any of that. It is about causing a text stream to function. And that, completely pragmatic issue, is what inclines me in that direction. Despite my INTENSE DISLIKE of that solution (see my other message about liking and disliking). Such intuitions, many from people who have been reading Hebrew since early childhood and have been immersed in Hebrew literature all their life, are likely to be far more reliable than impressions taken from a single book. The likelihood is that these intuitions represent the underlying reality. I said: >So long as it works, I will use it. And if everyone else wishes not to >use it on principle, then that's their problem. > >What needs to be demonstrated now is that it WON'T work. > > Peter responded: I'm not sure what "it" is here. The other solution which you were implicitly discussing, with a new character that we don't want. I said: >And please don't bring up the existing data again. It will continue to >work as well as it has so far (not perfectly in other words) and it has >already been brought to our attention that conservatism re: existing >data is not a legitimate argument. Fix involves change.... > > > Peter said: >The problem is that, whether we like it or not, people won't fix things which they don't perceive to be broken. That's fine. If they don't perceive it to be broken, then for them it is fine, isn't it? It will continue to behave as it always has, and suffer from the same limitations it always has. 
K From verdy_p@wanadoo.fr Tue Aug 3 10:10:30 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 10:10:30 -0500 (CDT) Received: from mwinf0304.wanadoo.fr (smtp3.wanadoo.fr [193.252.22.28]) by unicode.org (8.12.11/8.12.11) with ESMTP id i73FAUdj027948 for ; Tue, 3 Aug 2004 10:10:30 -0500 Received: from VENGEROV (malakoff-1-82-67-109-84.fbx.proxad.net [82.67.109.84]) by mwinf0304.wanadoo.fr (SMTP Server) with ESMTP id 5B13018001A5; Tue, 3 Aug 2004 17:10:23 +0200 (CEST) Message-ID: <001f01c4796b$fecff680$6801a8c0@VENGEROV> From: "Philippe Verdy" To: Cc: "Hebrew List" References: Subject: [hebrew] Re: Holam background document Date: Tue, 3 Aug 2004 17:10:20 +0200 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1437 X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2800.1441 X-archive-position: 1939 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: verdy_p@wanadoo.fr Precedence: bulk X-list: hebrew Peter Kirk (alias "Karljurgen Feuerherm" ) wrote: > Everson and Shoulson's proposal is for one character both for the dot > in Holam Male and for Holam Haser with every base character except for > Vav, > and another character for Holam Haser only when the base character is > Vav. > Theoretically a mess as the distinction is arbitrary, based neither > on the glyph nor the semantics but only on the context (the base > character). Yes, if a new character must be introduced (should the solution with ZWJ/ZWNJ be rejected) to make such distinctions possible, this solution is better than the one that requires changing all holam dots in the frequent "holam male" case, simply because everyone seems to agree that Holam Haser on Vav is very infrequent compared with Holam Male, and that it is this case that conflicts with the more frequent Holam Male case.
Suppose we add U+XXXX "HEBREW POINT HOLAM HASER FOR VAV" (called HHFV below), then could we define a _compatibility_ decomposition for it, using the other proposed encoding with ZWJ ? Of course this will NOT be a canonical decomposition (no need for it, and it would not solve the problem with renderers that normalize their input...): Logically we would encode [vo] with , which is distinct from ['o] coded with (holam male), but that would have a _compatibility_ decomposition to (to help renderers that still don't know HHFV and may opt for this compatibility mapping in order to represent [vo] with , with a possible rendering fallback to identical to holam male ['o])... But, are there cases today in Unicode where a combining diacritic has a compatibility decomposition to a base character (ZWJ) and another diacritic? I know we have mappings from diacritics to other diacritics (for example some Greek double accents). From verdy_p@wanadoo.fr Tue Aug 3 10:40:19 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 10:40:19 -0500 (CDT) Received: from mwinf0106.wanadoo.fr (smtp1.wanadoo.fr [193.252.22.30]) by unicode.org (8.12.11/8.12.11) with ESMTP id i73FeINl003048 for ; Tue, 3 Aug 2004 10:40:18 -0500 Received: from VENGEROV (malakoff-1-82-67-109-84.fbx.proxad.net [82.67.109.84]) by mwinf0106.wanadoo.fr (SMTP Server) with ESMTP id 2843A18001FD for ; Tue, 3 Aug 2004 17:40:12 +0200 (CEST) Message-ID: <002d01c47970$28f43e90$6801a8c0@VENGEROV> From: "Philippe Verdy" To: "Hebrew List" References: <001f01c4796b$fecff680$6801a8c0@VENGEROV> Subject: [hebrew] Re: Holam background document Date: Tue, 3 Aug 2004 17:40:09 +0200 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2800.1437 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1441 X-archive-position: 1940 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: 
hebrew-bounce@unicode.org X-original-sender: verdy_p@wanadoo.fr Precedence: bulk X-list: hebrew From: "Philippe Verdy" > Logically we would encode [vo] with , which is distinct from ['o] > coded with (holam male), but that would have a _compatibility_ > decomposition to (to help renderers that still don't know > HHFV and may opt for this compatibility mapping in order to represent [vo] > with , with a possible rendering fallback to > identical to holam male ['o])... As a side note, I think that logically, I would prefer a solution with ZWJ to one with a new HHFV character: Holam male is the case where there's a sort of missing letter between Vav and holam, and that absorbs that Vav to make it silent (that's why I noted the phonetic with ['o] rather than simply [o]). The second reason is that the Holam point in Holam Haser after Vav, and in Holam male is still a holam with the same logical and graphical identity (what is really modified in Holam Male is the Vav). Another good alternative would be to encode a new modifier for Vav that makes it silent. This letter modifier would act like the DAGESH modifier on other consonants. Suppose we call it HEBREW SILENT, and it would be a normal combining character, but with a combining class 0. Then we would encode that represents the sound [(v)o] where the parentheses surround a silent phoneme, where [(v)] is more explicit than just a simple quote in ['o] or nothing in [o]...
As this feature is not specific to Hebrew but common to many phonetic representations (for example the silent final letters in French or English, whose pronunciation is almost always avoided unless there's a surrounding context that requires voicing them), maybe this is related to other similar problems that were discussed some months ago about "missing implied letters" that are not explicitly marked but still logically and semantically there (the difference here is that the letter in Holam male is explicitly written but should be treated as if it was absent...) From peterkirk@qaya.org Tue Aug 3 14:42:48 2004 Received: with ECARTIS (v1.0.0; list hebrew); Tue, 03 Aug 2004 14:42:48 -0500 (CDT) Received: from pan.hu-pan.com (hu-pan.com [67.15.6.3]) by unicode.org (8.12.11/8.12.11) with ESMTP id i73JgYcU027132 for ; Tue, 3 Aug 2004 14:42:48 -0500 Received: from [213.162.124.237] (helo=[127.0.0.1]) by pan.hu-pan.com with asmtp (Exim 4.34) id 1Bs5B8-00033J-6M; Tue, 03 Aug 2004 20:42:30 +0100 Message-ID: <410FEAAD.5060908@qaya.org> Date: Tue, 03 Aug 2004 20:42:37 +0100 From: Peter Kirk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040616 X-Accept-Language: en-gb, en, en-us, az, ru, tr, he, el, fr, de MIME-Version: 1.0 To: Karljurgen Feuerherm CC: hebrew@unicode.org Subject: [hebrew] Re: UTC - Holam proposals References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - pan.hu-pan.com X-AntiAbuse: Original Domain - unicode.org X-AntiAbuse: Originator/Caller UID/GID - [0 0] / [47 12] X-AntiAbuse: Sender Address Domain - qaya.org X-Source: X-Source-Args: X-Source-Dir: X-archive-position: 1941 X-ecartis-version: Ecartis v1.0.0 Sender: hebrew-bounce@unicode.org Errors-to: hebrew-bounce@unicode.org X-original-sender: peterkirk@qaya.org Precedence: bulk X-list: hebrew On 03/08/2004
15:45, Karljurgen Feuerherm wrote: > ... > >>The ZWNJ solution is acceptable to me and to many others, but if it is >> >> >not > > >>acceptable to the UTC we all have to see what can be done, but this >> >> >should > > >>not be used as the reason to choose a bad solution because it is the >> >> >only > > >>alternative on the table. >> >> > >I would be happy to live with this if I were convinced that it would >work as claimed; but a lot of discussion lately has suggested that it >may not. > > There have been a lot of suggestions that it will not work, but no proof at all. No one has been able to name a rendering system which supports Hebrew at all but does not support ZWNJ. There has just been talk about imaginary rendering systems which conform to Unicode but don't render as it recommends. On the contrary, I have proved that it does work, with one rendering system. I can hardly be expected to prove it with a whole lot of systems. When I have tried to push for details of what doesn't work, the best I have got is that my proposal works with certai