Date: Sun, 3 Dec 95 17:47:05 PST
From: "Joan Aliprand"
To: unicore@Unicode.ORG
Subject: UTC #66: Draft Minutes

Unicode Technical Committee Meeting #66
Draft Minutes

Lotus Development Corporation, Cambridge, MA
Friday, Sept. 29, 1995

Attendance:
-----------

Member Representatives:

  Tim Greenwood, Digital Equipment Co. (timg@zk3.dec.com)
  Glenn Adams, Stonehand (glenn@stonehand.com) for Ecological Linguistics
  Mike Ksar, Hewlett-Packard (ksar@hpcea.ce.hp.com)
  John Gioia, IBM (gioia@vnet.ibm.com)
  Lisa Moore, IBM (lisam@vnet.ibm.com)
  Ed Batutis, Lotus (ebatutis@lotus.com)
  Murray Sargent, Microsoft (murrays@misrosoft.com)
  Lloyd Honomichl, Novell (lloyd.homomichl@novell.com)
  Joan Aliprand, RLG (br.jma@rlg.stanford.edu) - Chair
  Mark Davis, Taligent (mark_davis@taligent.com)

Associate Members:

  Tex Texin, Progress Software (texin@progress.com)
  Ed Hart, SHARE, Inc./X3L2 (Edwin.Hart@jhuapl.edu)

Observers:

  Mark H. David, Committee for Implementation of the Standardized Yiddish Orthography
  Smita Desai, FTP Software (smita@ftp.com)
  Mark Leisher, New Mexico State University (mleisher@crl.nmsu.edu)

The meeting began with introductions. Adams presented a proxy from Ecological Linguistics. Leisher said that his purpose in attending as an observer was to determine whether New Mexico State University should join the Consortium as an Associate Member.

Approval of Minutes of UTC #65
------------------------------

Adams did not receive a copy of the draft minutes of UTC Meeting #65.

Action Item #1 (Steve Greenfield): Make sure all Officers (including Technical Directors) are on the "minutes" distribution list.

There were no amendments to the Minutes of Meeting #65.
Motion to approve the Minutes of Meeting #65
Moved by Sargent, seconded by Honomichl.
Vote was: 8 aye, 0 nay, 1 abstention (Digital)

Greenwood abstained because he was not at Meeting #65.

Report on Seventh International Unicode Conference
--------------------------------------------------

Moore, who served as Chair of the Editorial Board of the Seventh International Unicode Conference, reported that the Conference was a great success, thanks to contributions from all the members. More than 50 volunteers assisted in various ways.

The Conference was a sold out event, with 396 attendees (including speakers, etc.). Planning was for a maximum of 300 people; the 1994 Sixth Workshop had 237 attendees. At 7:45 am on the first day, there was a line to register; there were 65 walk-in registrations.

The Conference was very highly rated. On a scale of 1 (lowest) to 5 (highest), the overall rating was 4.17, sessions were rated 4.16, audio-visual was rated 4.7. More than a quarter of the speakers were rated higher than 4.0.

Sargent said that it is just as important to have a conference next year. Davis said that by employing a firm to handle logistics we open up the possibility of having conferences more often. Davis also raised the idea of an open call for papers for future conferences.

Action Item #2 (Moore): Moore is to provide a report on the Conference for Encoding and the Web site. [Done]

Adams suggested that the open call for papers be posted at the Web site, together with the Conference report. [Done]

Hart asked whether the Conference included a tutorial. The Seventh Conference included a tutorial on globalization by Richard Ishida (which was the highest rated presentation).

Ksar mentioned missing topics to be considered for future conferences, including in-depth tutorials on specific topics, introductory workshops, end-user case studies, Unicode in UNIX, and Unicode in IBM COBOL.
Adams asked what proportion of papers resulted from the "call for papers" versus direct requests. Davis said that one feature of this Conference was that many more papers came from the "outside", from people not directly involved with the development of Unicode.

Action Item #3 (Texin): Tex Texin is to solicit feedback from the exhibitors.

Davis asked how the exhibits affected attendance. Moore replied that approximately 30-40 people attended only the exhibits.

Cooperation between X3L2 and UTC
--------------------------------

X3L2 and the UTC have held several co-located meetings in order to reduce time and travel costs. Sequential, co-located meetings have been possible because there is substantial overlap in the membership of the two organizations, and Unicode/10646 is of overriding interest to both.

The terms of three of the X3L2 Officers expire in Spring. There is a problem of getting new officers because everyone involved in X3L2 has other responsibilities.

Hart and Aliprand presented a proposal for X3L2 and the UTC to hold a joint meeting. To consolidate cooperation, the office holders of one group would also be office holders in the other group. Aliprand has volunteered for Vice-Chair of X3L2, and Arnold Winkler has volunteered for Chair of X3L2; it would be appropriate to appoint Winkler as Vice-Chair of the UTC. Hart thanked Asmus Freytag for all his work on this proposal.

Davis stated that it sounded like an efficient way to work, and that it is difficult to justify two separate meetings.

Adams asked about X3 issues. Aliprand replied that Freytag had talked to Hart and Ksar in developing the initial proposal. Hart said that the draft proposal was sent to X3 for review, and X3 had made no objection.

Ksar expressed concern that arrangements be settled before December 8 (the date of the next UTC meeting, which Hewlett-Packard is hosting), and pointed out that the joint meeting would need to follow X3 procedures (re notification, etc.).
Appointment of Arnold Winkler as UTC Vice-Chair was approved by consensus, on the understanding that he is willing to serve.

Discussion on distribution of X3L2 documents followed. Arnold Winkler has responsibility for mailings to X3L2 members. Hart said Winkler is willing to extend the mailing to all UTC member representatives and other regular UTC attendees who are not on the X3L2 mailing list. He will also include UTC-specific documents in the mailings as needed. Another point was raised: Should Associate Members be included in the mailings?

There was then discussion on the possible impact of the proposal on X3L2 Membership. Hart stated that minimal membership dues are $600.00. X3 procedures require attendance at meetings. Greenwood and Sargent both saw the proposal as making it easier to attend meetings.

Sargent raised the point that X3L2 is a U.S. body, whereas the Unicode Consortium (and so the UTC) is international. Will the proposal represent a problem for international members? Hart replied that X3 was concerned about this too, because participation in US TAG discussions is limited to US-domiciled companies. He said that if US TAG discussions are needed, there would be a separate, short meeting for this.

Motion: The UTC accepts in principle the joint working relationship with X3L2, and agrees to hold joint meetings beginning in December 1995. The logistics are to be left to the chairs of both organizations.
Moved by Ksar; seconded by Davis.
Unanimously approved.

Action Item #4 (Hart): Hart will inform X3L2 members of the procedures for X3L2 proxies, which may assist in meeting attendance requirements.

Version 2.0 of The Unicode Standard
-----------------------------------

Davis led this part of the meeting.

In order to speed up the progress of the book, there is now a group of volunteer authors and reviewers, who meet in the Bay Area every two weeks.
Once there is a complete draft of Version 2.0, it will be distributed to Corporate and Associate Members for review. The review period will be four weeks (as previously agreed).

The text of the book will reflect what we now know about how people approach Unicode. The text will be the original contents of volumes 1 and 2, Version 1.1 (UTR #4), and some new material (including UTC decisions since Version 1.1).

The reviewers are Lisa Moore, Lloyd Honomichl, and Tim Greenwood; the authors are Mark Davis, Rick McGowan, Joe Becker, Asmus Freytag, Joan Aliprand, Ken Whistler, and Glenn Adams. Workload for authors and reviewers is one to two days of work every two weeks. [Lisa Moore subsequently took on responsibility for Chapter 6, and Mike Ksar joined the reviewers.]

Chapter 3, "Formal Definition of the Unicode Standard," is an important part of Version 2.0, and it is being distributed for review in advance of the whole book. Davis explained that Chapter 3 arose out of work Adams did to formalize statements in various parts of Version 1.0. Chapter 2 of Version 2.0 is explanatory, and Chapter 3 contains the conformance piece from Chapter 2 of Version 1.0. Some terminology has been changed for clarity, but the content is based on Version 1.0.

Action Item #5 (All Member Representatives): Send marked-up copies of Chapter 3 with corrections/comments to the Unicode Office:

  Unicode, Inc.
  P.O. Box 700519
  San Jose, CA 95170-0519

Gioia asked whether the authors were changing the whole issue of conformance. He also requested clarification on how this would relate to logo licensing. Davis responded that the authors used Version 1.0 as the basis for this chapter. As published, the definition of conformance was vague, e.g., it referred to "character semantics". Since that time, normative character semantics have been clarified. This statement of conformance should be nothing new. It does not affect logo licensing at all.
Hart asked how the closing date of the ballot on UTF-16 related to the anticipated publishing date. Davis said that publication would be after the end of 1995. Ksar said that UTF-16 has passed from the SC2 point of view.

Greenwood said that the chapter gets off to a good start, but then gets lost. Moore also commented on the general formality versus flow, and said that certain important things get lost.

Gioia asked how closely some of the terminology/phrases are going to be tied together. Adams said that in his first draft, he tried to use 10646 terms as normative.

Greenwood gave examples: abstract character; abstract character sequence, followed by parallels for Unicode abstract character, Unicode abstract... Not as notes. Keep code values here equal to 16 bit or 32 bit. New definition for Unicode Codepoint. Define "unit of encoded text".

Adams added that we should also define "CODE VALUE" independent of characters. Greenwood added that we need a more formal, rigorous definition, or should abandon the pretense that this is formal.

Ksar stated that he thought he would have problems, but didn't. There seem to have been more definitions than may have been necessary.

Sargent raised the question of what should be said when writing documentation for our programs. He recommended "Unicode Point" in place of "Unicode Value".

Texin said that it's an issue of "what is a character". Is this abstract or what?

Davis stated that the issue was being formal enough to provide precise definitions versus making the text comprehensible. The goal has to be a balance. Texin stated that the goals might be in conflict. There would need to be rigorous documentation.

Hart asked if the Book would be much better if definitions were split out from conformance. On the notion of "character", he prefers the term "character".

Greenwood stated that the goal of balance is the worst of both worlds. Write Chapter 3 as a formal appendix. Have a textual version of Chapter 3 with reference to the appendix. Adams supported this idea.

Ksar suggested a summary of what is required to conform.
C1 and C2 need to be restated as "shall".

General consensus was that the chapter should be reorganized to begin with statements for conformance, followed by definitions.

Further discussion of Chapter 3 was deferred so that the UTC could discuss items from the agenda of the WG2 meeting in Tokyo (Nov. 6-10).

WG2 Meeting
-----------

Ksar outlined certain WG2 items:

* Note action items in minutes.
* pDAM-5: resolution of Korean comments. Reiterated position from UTC #65 to deprecate sooner rather than later.
* pDAM-7: Hebrew cantillation marks, Vietnamese currency sign.
* pDAM-8: CJK ideographs annex

Davis asked whether pDAM-8 is essentially unchanged. Ksar replied that there were three minor editorial changes, and that Adams was involved in formulation of the document.

Motion to approve pDAM-6 (Tibetan), pDAM-7 (Miscellaneous characters including Hebrew cantillation marks), pDAM-8 (Annex on CJK ideographs).
Moved by Davis, seconded.
Unanimously approved.

pDAM-5 -- Disposition of comments:
----------------------------------

UTC is strongly in support of pDAM 5, and feels it meets industry requirements. Re retention of old coding: all companies were consulted, and overwhelmingly favored immediate deprecation.

Microsoft promised mapping tables and they are overdue. They are needed before the end of October.

Action Item #6 (Sargent): Sargent is to pursue Korean mapping tables with S.G. Hong.

Hart said that immediate deprecation is the solution, though not favored by all WG2 representatives. Davis said that the points made by Japan had been debated in the UTC discussions.

The issue of space for additional ideographs was raised. Adams said that the UTC should encourage IRG to continue examination of an ideographic composition method. IRG has done some work on prioritization. Davis said that industry need is for the 700 gaiji. Adams said that UTC needs to convey its priorities to IRG: to continue and complete current work, and to develop an ideographic composition method.

Byzantine Music [N1208]:
------------------------

Consensus was that UTC is not in favor of encoding these characters in the BMP. (This is a stronger statement than at UTC #65.)

Mongolian [N1226, N1248]:
-------------------------

Two interested countries (China and Mongolia) plus Unicode Consortium. Mongolia wants to be considered the primary arbiter.

Ksar proposes encoding only base characters. There should be consistency with the basic principles of ISO 10646, to encode characters not glyphs. Presentation forms for Mongolian should not be on the BMP.

Encourage Mongolians and Chinese to work with the Unicode Consortium. Adams volunteered to communicate with them, but Aliprand reminded everyone that Freytag already had contacted them. Adams said that he would talk to Joe Becker. There was discussion of whether communication should be only via Liaison statement to SC2, or directly in addition to the Liaison method.

Aliprand reported on behalf of Joe Becker. Becker now has all the WG2 documents and the Mongolian character set proposed by German implementers. There is a unique problem with Mongolian, as it presents an unusual relationship between language and script. The same glyph is used not only for different vowels, but these identical glyphs are different letters in the alphabet. Mongolian is allegedly syllabic, but its script is more like an alphabet. It is not always obvious what the units of the script are. There is a bigger set of constraints that the encoding must meet than just letter encoding because there are also contextual forms. In Joe's opinion, Mongolian is one script above all whose encoding should be based on an existing implementation. Becker proposes to undertake a comparative analysis of the various proposals.

Iranian proposal [N1247]:
-------------------------

Request for "pseudo space" and "pseudo connection". UTC position is that these characters are equivalent to the non-joiner and joiner. This needs to be explained to Iranians.
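
For readers who want to see the two characters the UTC position refers to, here is a minimal sketch. It assumes that the "non-joiner" and "joiner" of the minutes are the characters known in present-day Unicode as ZERO WIDTH NON-JOINER (U+200C) and ZERO WIDTH JOINER (U+200D); the minutes themselves give no code points. The helper name `describe` and the sample word are illustrative only and do not come from N1247.

```python
import unicodedata

# Assumption: the "non-joiner" and "joiner" of the minutes correspond to
# these two code points in the present-day Unicode character database.
NON_JOINER = "\u200c"  # ZERO WIDTH NON-JOINER
JOINER = "\u200d"      # ZERO WIDTH JOINER


def describe(ch):
    """Return (U+ notation, character name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


# Hypothetical sample: a Persian word in which the non-joiner sits between
# two letters that would otherwise connect in cursive rendering.
SAMPLE = "\u0645\u06cc" + NON_JOINER + "\u0631\u0648\u0645"

if __name__ == "__main__":
    for ch in (NON_JOINER, JOINER):
        print(*describe(ch))
    # The non-joiner is an invisible format character: it adds one code
    # point to the string but no visible glyph of its own.
    print(len(SAMPLE), len(SAMPLE.replace(NON_JOINER, "")))
```

Both characters are format controls (general category Cf) rather than spaces or letters, which is consistent with the position that a separate "pseudo space" is unnecessary.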

Uighur [N1225]:
---------------

Do not encode shapes.

Ogham [N1246]:
--------------

UTC consensus that this script should be encoded in an upper plane (accessible by UTF-16).

New Symbols:
------------

Unicode Consortium is collecting symbols. UTC position is that only symbols that are used in text applications should be included.

Adams asked about the status of various scripts. It was noted that the UTC needs to renew its efforts on outstanding scripts.

Uniqueness of Names [N1231]:
----------------------------

The Canadian contribution to WG2 on unique identifiers and translatability of character names was discussed. Ksar said there is an SC22 proposal to use hex code positions as unique identifiers, which takes pressure off the issue of names of characters.

UTC consensus: Agreement with SC22 position. UTC has already used this approach in U+ notation.

The issue of unique identifiers in the UTF-16 context was raised. Davis said that what people want to identify is code values (e.g. to identify the full-width vs. the half-width katakana ka).

Motion: Use hex values as unique coded character identifiers.
Moved by Davis, seconded by Adams.
Vote was: 8 aye, 0 nay, 2 abstentions

Old Business: Math Operators
----------------------------

Sargent accepted leadership of the math operator set project.

Action Item #7 (Sargent): Sargent is to ask for volunteers to work on this project via posting to "unicore" dl.

Adams pointed out that possible CJK ideographic composition may have possible usage with math. Davis cautioned that this might be a slippery path, but there is a possible need for a minimal set of operators for legibility. Adams suggested that such operators could be considered for the general punctuation block, although there could be script-specific ones, too.

Ethiopic
--------

Aliprand reported on Ethiopic at the request of Joe Becker. The UTC would like to see a proposal on Ethiopic for December UTC.
The current work has augmented the original UTC proposal. For Ethiopic, there are basic and extended alphabets: the problem is where to draw the line. An Ethiopian at Indiana University is working on completing the extended repertoire.

There is 90-95% agreement on the script and the encoding approach, but there is a need to focus on the basic set and to defer discussion of problematic characters. This is Joe's preference. Becker is doing another round of consultation with expert informants, with the target of completion by the WG2 Tokyo meeting in November. The aim is to define the basic set; questionable rare characters should not be included.

Additional Yiddish Letter
-------------------------

Mark H. David, representing the Committee for Implementation of the Standardized Yiddish Orthography, requested addition of precomposed yod with hirik, as this was the only letter of the standardized Yiddish orthography that is not encoded.

Motion: The UTC accepts the proposed character, and considers that it should be encoded as a compatibility character. The UTC decision is to be communicated to WG2 via a liaison statement.
Moved by Adams, seconded by Davis.
Unanimously approved.

Tracking Additions
------------------

Aliprand reported Becker's concern about "creeping augmentation" and the lack of a central source of information on additions.

Davis said that the additions that the UTC has accepted, and that WG2 has approved, are likely to pass in the ISO ballot. When we do make additions, we need to keep track of the status of the additions. We could put information on the Web page. He suggested that a summary of actions taken at UTC Meetings be added to the Web page.

Texin said that tracking of action items on the Web page is good, but is not complete. We also need to know the revision level. Adams (who maintains the Web page) said that every document has the date of last revision indicated on it.

Davis said that the FTP site character database is Version 1.1.
The extra characters are all from this year, and are "approved".

With respect to possible future additions, Moore said that the Cherokee Nation might review the proposal at the end of October.

Language Codes
--------------

Various ideas about language codes had been discussed on "unicore" and the ISO10646 listserv. The discussion had been initiated by Mark Leisher.

Davis said that the UTC has had a rule from the beginning: changes to the Unicode Standard need an organizer and a concrete proposal.

Adams said that he had recommended use of SGML language tags. The question that the UTC needs to answer is: Do we want to sanction language tags in Unicode plain text?

Leisher said he is pushing use of SGML. However, the feeling among his user population is that SGML tools are not sophisticated enough.

Davis moved to adjourn; Sargent seconded.

The Chair thanked Lotus Corporation for hosting the meeting, and Ed Batutis for making the arrangements.

Meeting ended at 4:55 p.m.

To: UNICORE@UNICODE.ORG, OFFICERS@UNICODE.ORG