[72907] MON 07/01/96 11:09 FROM V-SGREEN@MICROSOFT.COM "Steve Greenfield
        (Unicode)": UTC #69 Minutes (Part 2); 730 LINES

Received: by rlg.org; Mon,  1 Jul 96 11:09:28 PDT
Received: from abash1.microsoft.com by Unicode.ORG (NX5.67c/NX3.0M)
        id AA04841; Mon, 1 Jul 96 10:29:20 -0700
Received: by abash1.microsoft.com with Microsoft Exchange (IMC 4.0.838.14)
        id <01BB6737.72695AF0@abash1.microsoft.com>; Mon,
  1 Jul 1996 10:24:12 -0700
Message-Id:
 <c=US%a=_%p=msft%l=RED-13-MSG-960701172216Z-23680@abash1.microsoft.com>
From: "Steve Greenfield (Unicode)" <v-sgreen@microsoft.com>
To: "'unicore@unicode.org'" <unicore@Unicode.ORG>
Cc: "'unicode-inc@unicode.org'" <unicode-inc@Unicode.ORG>,
        Mike Kernaghan
         <mikekern@microsoft.com>,
        'Joan Aliprand'
         <br.jma@rlg.org>
Subject: UTC #69 Minutes (Part 2)
Date: Mon, 1 Jul 1996 10:22:16 -0700
X-Mailer:  Microsoft Exchange Server Internet Mail Connector Version 4.0.838.14
Encoding: 710 TEXT

Here is Part 2 of the minutes of UTC #69.  If there are any areas that
require feedback, please let me know, and I will update my master copy.


The main representatives of Corporate members will receive a hard-copy
in the mail, too.   If anyone else would like a hard-copy mailed to
them, please send me a private e-mail with your address.

Sincerely,
Steve
unicode-inc@unicode.org
================================

Friday, June 7, 1996 meeting.
I.      Administrative Issues.
A.      UTC Membership Roll Call:
Corporate Members:
John Jenkins, Apple Computer, Inc. jenkins@apple.com (attended both
days)
Mike Ksar, Hewlett-Packard Company, ksar@hpcea.ce.hp.com (attended both
days)
Dr. V.S. Umamaheswaran, IBM Corporation, dorai@vnet.ibm.com (attended
both days)
Lisa Moore, IBM Corporation (Alternate), lisam@vnet.ibm.com (attended
both days)
Tatsuo Kobayashi, Justsystem Corporation,
tatsuo_kobayashi@justsystem.co.jp (attended 6/7 only)
Murray Sargent, Microsoft Corporation, murrays@microsoft.com (attended
both days)
Michel Suignard, Microsoft Corporation (Alternate),
michelsu@microsoft.com (attended both days)
F. Avery Bishop, Microsoft Corporation, averyb@microsoft.com (attended
6/7 only)
Rick McGowan, NeXT Software, Inc., rick_mcgowan@next.com (attended both
days)
Gary Roberts, NCR, gary.roberts@elsegundoca.ncr.com (attended both days)
Joan Aliprand, Research Libraries Group, Inc., br.jma@rlg.org
Glenn Adams, Spyglass, Inc., glenn@spyglass.com (attended both days)
Ken Whistler, Sybase, Inc., kenw@sybase.com (attended both days)
Arnold Winkler, Unisys (X3L2 Chair & UTC Vice Chair),
winkler@po3.bb.unisys.com (attended both days)
Associate Members:
Asmus Freytag, ASMUS, Inc., asmus_freytag@unicode.org (attended both
days)
Mark Crispin, University of Washington, mrc@cac.washington.edu,
(attended 6/7 only)
Bob Sandstrom, University of Washington, sandstro@cac.washington.edu,
(attended both days)
Officers of the Consortium:
Mark Davis, Taligent, Inc. (Unicode, Inc. President),
mark_davis@taligent.com (attended 6/7 only)
Mike Kernaghan, Microsoft Corporation (Unicode, Inc. Vice President),
mike_kernaghan@unicode.org (attended both days)
X3L2 Member companies represented:
Apple Computer; Hewlett-Packard; IBM; Microsoft; Research Libraries
Group; Unicode, Inc.; Unisys.

Full (Corporate) Member companies represented:
Apple, HP, IBM, Justsystem, Microsoft, NCR, NeXT, RLG, Spyglass, Sybase,
Unisys.

Full (Corporate) Member companies absent:
Digital Equipment, MGI Software, Novell, SGI

Meeting Started at 8:11 a.m.  Arnold Winkler to remain Chair of Joint
meeting.

III.    Specific Scripts (Includes interaction with WG2)
Mike Ksar requested that a set of procedures be established for the
acceptance of documents into any future meetings by X3L2 or the UTC.
This will minimize confusion on which documents are to be given a
number, while increasing effectiveness in the quality requirements of
documents that have been accepted.

Motion #69-6:  Documents to be distributed at the UTC should be
distributed through the Chair of the committee, and the Chair should
ensure that each distributed document contains at a minimum the
following information:
        1)  A document number;
        2)  The author's (or authors') name(s);
        3)  A date;
        4)  A title;
        5)  the action requested by the document.
Moved by Ken Whistler, seconded by Rick McGowan
Unanimous vote
Motion approved.

A.      Ethiopic
There was not enough time to discuss this item at the Copenhagen
meeting, so this item was bumped to the next meeting, to be hosted in
Quebec during August 96.  The repertoire has been accepted by the UTC.
The proposal is also stable enough for support to be gathered for
passage.  A pDAM should be prepared by Bruce Paterson for the Ethiopic
script.

Action Item #69-A13:  Glenn Adams
Glenn Adams is to talk to Bruce Paterson about Tibetan and find out if
there is any further information on, or any changes to, the Tibetan
Proposal.

B.      Cherokee
Lisa Moore gave a brief review of the Cherokee Repertoire.  In her
dialogues with the Cherokee Nation of Oklahoma, she has asked them about
their feelings towards this latest proposal.  They (the Cherokee Nation
of Oklahoma) are very happy with the current ordering and are willing to
honor those that worked on the project.  Lisa Moore has not been in
contact with the other two tribes.  The names of the characters for
Cherokee in N 1172 all begin with "CHEROKEE LETTER ?" rather than
"CHEROKEE SYLLABLE ?"  There was some discussion about the advisability
of changing the names back to "CHEROKEE SYLLABLE ?" but no consensus in
favor of that change.  The Unique Naming Convention is met in either
case, and there is not a technical reason for the UTC to require a
change in the proposal on this basis.  The Cherokee Nation of Oklahoma
knows about this area of concern, and is ambivalent towards this change.

Motion #69-7:  The UTC accepts the proposed repertoire, ordering, and
names for the Cherokee Syllabary as specified in document N 1172 and
suggests encoding in the BMP starting at U+1500.
Adopted by Consensus.

Action Item #69-A14:  UTC
The UTC is to work with interested parties to create a final consensus
on the names list for the Cherokee Syllabary as the proposal progresses
to pDAM.

Action Item #69-A15:  Unicode Liaison to WG2
The Unicode Liaison to WG2 is to relay the position of the UTC as stated
in Motion #69-7 regarding encoding of the Cherokee Syllabary and should
suggest that WG2 progress the Cherokee proposal under the regular review
process of WG2.

C.      Mongolian
August meeting:  X3L2 is going to send a letter requesting an official
invitation.
Asmus Freytag would like to see the same thing from the UTC requesting
an official invitation.

Action Item #69-A16:  Steve Greenfield
Send Arnold Winkler letterhead and envelopes by June 10 for his letter
to the Mongolian delegation.

Action Item #69-A17:  Arnold Winkler
Draft a letter to the Mongolian delegation asking for an official
invitation to the August meeting.

D.      Canadian Aboriginal Syllabics (X3L2/96-063):
The names list was not included in document X3L2/96-063, but is
available from Rick McGowan.  Rick McGowan has reviewed X3L2/96-063, and
as of this time, he has some objections, but none that can't be worked
out.  He feels the naming convention has too much detail and is
inconsistent.  The names should be unique.  Documents are available in
English (-en) or French (-fr) at:
http://www.indigo.ie/egt/standards/casec/sl-pdam-en.html
http://www.indigo.ie/egt/standards/casec/sl-pdam-fr.html

Motion #69-8:  The UTC accepts the proposed repertoire for the Canadian
Aboriginal Syllabics (X3L2/96-063), including the representative glyphs
and character names (subject to possible shortening), as well as the
proposed encoding, for future coding in the Unicode Standard.  Text
shall be added to the Standard indicating that the Canadian Syllabic
Characters are not limited in scope to Canadian Aboriginal languages,
but are also used to represent languages elsewhere, such as in the US
and Russia.
Moved by Asmus Freytag, seconded by Dr. V.S. Umamaheswaran.
Unanimous vote
Motion approved.

Motion #69-9:  That the UTC forward this decision on Aboriginal
Syllabics Encoding to WG2 and the Canadian National Standards Committee,
and recommend that the names be shortened by removing the word
"character", or by other means.
Moved by Asmus Freytag, seconded by Glenn Adams.
Unanimous vote
Motion approved.

Action Item #69-A18:  Unicode Liaison to WG2
The Unicode Liaison is to communicate the UTC position to WG2, which may
then process our proposal under its regular review process.

E.      Runic [X3L2/96-035 and X3L2/96-051]:
UTC does not wish to accept parenthetical naming conventions of the
Runic proposal and would like to see them simplified.  Any symbols in
this proposal that are in general usage must be placed in the symbols
mappings or unified.  Cross character can be unified, single punctuation
and double punctuation can also be unified.  Ogham, as far as WG2 has
characterized it, has started the precedent of being placed outside the
BMP due to being a dead script, and this is where Runic would be placed.
John Jenkins favored encoding Runic outside the BMP because he could
show the IRG that scripts other than ideographs were being encoded
outside the BMP.  Ken Whistler and Rick McGowan had expressed some wish
to see this particular script on the BMP.  In summary, it was felt that
the proposal is OK; the repertoire & code points were accepted.

Motion #69-9:  The UTC accepts the Runic character repertoire from
X3L2/96-035 for encoding in the Unicode Standard, subject to the
following conditions:
1)      The Runic Single Punctuation be unified with the middle dot U+00B7;
        the Runic Multiple Punctuation be unified with the colon character
        U+003A; and the Runic Cross be unified with the Maltese Cross
        U+2720;
2)      Parenthetical aliases in the names be eliminated;
3)      Encoding using the surrogate extension mechanism (outside the BMP).
Moved by Mark Davis, seconded by Asmus Freytag.
Vote:  10 in favor, 1 abstention (IBM)
Motion approved.
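
For readers unfamiliar with condition 3, the following is a minimal
sketch of how the surrogate extension mechanism represents one character
outside the BMP as a pair of 16-bit code values.  The function names are
hypothetical, and the sample values are arbitrary; the minutes do not
assign any particular code position to Runic.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into (high, low) surrogate values."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only values outside the BMP use surrogates")
    offset = code_point - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    high, low = to_surrogate_pair(0x10000)
    print("%04X %04X" % (high, low))
```

The first value always falls in D800..DBFF and the second in DC00..DFFF,
so a pair cannot be confused with an ordinary BMP character.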

Action Item #69-A19:  Unicode Liaison to WG2
The Unicode Liaison is to take our position and relay it to WG2 and they
are to process our proposal under their regular review process.

F.      Burmese [X3L2/96-061]:
Lee Collins recently visited Myanmar to work on encoding of the Burmese
script.  The document [X3L2/96-061] parallels ISCII; the ordering is
wrong; "alphabetizing" (similar to the Latin alphabet being used to
alphabetize the Greek alphabet) this script creates a problem.  This is
not reflective of the current practice of Myanmar.  The names should be
native-ised.  This proposal has not taken into account the overall needs
of the script.  For example, the Shan community uses this script and
this proposal has not taken this into account.  There has been no
official backing by the government of Myanmar, which will make
implementing this proposal problematic.  Joan Aliprand went on record to
request the UTC obtain local support for a proposal from
Burmese-speaking people.  Rick McGowan and Glenn Adams also stated
strongly that the Consortium go on the record to further this end.  Lee
Collins is currently working towards this goal.

Brief History on Myanmar Language (courtesy of Mark Crispin):
The Myanmar Language and alphabets were first introduced in India in the
period from 500 BC to 300 AD.  It originated from Byar Mahee.
The Byar Mahee language was first known between the 3rd and 5th
centuries BC.  It became popular in India during the reign of King
Arthawka.  Between the 1st and 8th centuries AD, the use of this
language, along with Indian culture and customs, spread to Tibet,
Sri Lanka, Myanmar, Thailand, Cambodia and Indonesia.  The spread of
Byar Mahee to these lands helped in the formation of the native
languages and alphabets.

The writing of Byar Mahee spread to Myanmar.  The earliest writings in
Myanmar were found during the Pagan Dynasty between the 11th and 12th
centuries AD.  Basic Myanmar language was based on Byar Mahee and has
slowly developed to the present day Myanmar alphabets.

The Myanmar Language is made up of Consonants, Vowels, Punctuation and
Special Symbols.  A lot of Myanmar words came from Pali.

Motion #69-10:  The UTC strongly opposes the document X3L2/96-061 on
Burmese that is being submitted to WG2 because:
1)  It does not reflect actual usage of native speakers (of even Burmese)
in terms of ordering and names of characters.
2)  It does not account for the use of this script to write other
languages.
3)  It has not identified any native expert contributors.
4)  It attempts to force Burmese into an inappropriate correspondence
with ISCII (in the WG2 document referred to as Brahmic Unification
outlined by Hugh McGregor Ross in N1321.)
5)  The UTC has the policy of encoding length marks for representing
split vowels, such as in Tamil.  This policy should be continued.
6)  The names do not conform to the WG2 guidelines.
7)  Additional research must be completed before a mature proposal can
be prepared.  The UTC invites interested parties to join them in this
work.
8)  The proposal itself is not complete.
Moved by John Jenkins, seconded by Rick McGowan.
Unanimous vote
Motion passed.

Action Item #69-A20:  Unicode Liaison to WG2
The Unicode Liaison is to take our position and relay it to WG2 and they
are to process our proposal under their regular review process.

G.      Sinhala [X3L2/96-062]:
The National Standards Body of Sri Lanka is working on standardizing
Sinhala and had a working standard which caused the Consortium to
withdraw its proposal.  The UTC strongly opposes the proposal
[document X3L2/96-062] for Sinhala.  It was not congruent with the
proposal from Sri Lanka.  Arguments offered against this proposal were
similar to those offered for Burmese.

Motion #69-11:  The UTC strongly opposes the document X3L2/96-062 on
Sinhalese that is being submitted to WG2 for reasons similar to those of
Burmese.  In addition, the proposal does not take into account the
proposed Sinhalese National Standard.
Moved by John Jenkins, seconded by Joan Aliprand
Unanimous vote
Motion passed.

Action Item #69-A21:  Unicode Liaison to WG2
The Unicode Liaison is to take our position and relay it to WG2 and they
are to process our proposal under their regular review process.

H.      Khmer [X3L2/96-060]
The UTC was contacted by Norbert Klein, a computer expert in Phnom Penh,
who wishes to work with the Consortium on Khmer and coordinate the work
in Cambodia.  Lee Collins has agreed to be the Unicode contact person
for this work.

Motion #69-12:  The UTC strongly opposes the document X3L2/96-060 that
is being submitted to WG2 for reasons similar to those of Burmese and
Sinhalese.
Moved by Glenn Adams, seconded by Joan Aliprand
Unanimous vote
Motion passed.

Action Item #69-A22:  Unicode Liaison to WG2
The Unicode Liaison is to take our position and relay it to WG2 and they
are to process our proposal under their regular review process.

It was noted that Michael Everson's Khmer proposal [X3L2/96-060] still
had outstanding issues and questions.  These should be answered before
any proposal for the Khmer script is submitted to WG2.

I.      Additional Latin Characters [X3L2/96-053]:
1.      N 1361 Proposal for addition of Latin character-Romania
[X3L2/96-053]
Ken Whistler pointed out that these characters are already in
place in ISO/IEC 8859-2 Latin-2 and cross-mapped in many areas,
specifically for Romanian.  This could set a precedent if these four
characters are accepted.  Because programs already use existing data,
this could create additional problems.

Motion #69-13:  The UTC acknowledges there is a problem encoding Romanian.
However, existing implementations already use Latin-2.  Compatibility
with existing data is a major issue.  The UTC would like additional
evidence that these additions will not cause major data corruption.
Motion by John Jenkins, seconded by Joan Aliprand
Unanimous vote
Motion passed.

Action Item #69-A23:  Unicode Liaison to WG2
The Unicode Liaison is to take our position and relay it to WG2 and they
are to process our proposal under their regular review process.

Discussion:
1)  These four characters are already encoded in the following
locations:
U+015E
U+015F
U+0162
U+0163
2)  S and T cedilla are already encoded in Latin-2 and are specifically
there for Romanian, and have been cross-mapped to 10646.  The proposed
characters are just glyphic variants as noted on page 180 of the Unicode
Standard, Version 1.0, Volume 1.
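
The cross-mapping referred to in the discussion can be inspected with a
short sketch such as the one below.  It assumes Python's standard
"iso8859_2" codec and unicodedata module as stand-ins for the Latin-2
mapping tables; the character names printed come from the Unicode data
shipped with Python, not from these minutes, and the helper name is
hypothetical.

```python
import unicodedata

# The four existing code points listed in the discussion above.
CODE_POINTS = [0x015E, 0x015F, 0x0162, 0x0163]


def latin2_table(code_points=CODE_POINTS):
    """Return (U+XXXX, character name, Latin-2 byte) for each code point."""
    rows = []
    for cp in code_points:
        ch = chr(cp)
        latin2_byte = ch.encode("iso8859_2")[0]   # single byte in Latin-2
        rows.append(("U+%04X" % cp, unicodedata.name(ch), latin2_byte))
    return rows


if __name__ == "__main__":
    for code, name, byte in latin2_table():
        print(code, name, "Latin-2 0x%02X" % byte)
```

Any newly added characters with the same appearance would have no
Latin-2 byte of their own, which is one way to see the compatibility
concern raised in Motion #69-13.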

J.      Addition of 4 Cyrillic characters for Macedonia [X3L2/96-059]:
This proposal contains characters used in a modern language, they are
few in number, they are not glyphic variants of other characters, and can
be added into the Cyrillic block with little problem, based on the
evidence provided by the Republic of Macedonia.

Motion #69-14:  After careful review of the evidence provided by the
document [X3L2/96-059] the UTC supports the inclusion of these
characters in the Cyrillic block.
Moved by Glenn Adams as amended, seconded by John Jenkins
Vote:  8 for, 2 against (Sybase & NeXT), 1 abstention (NCR)
Motion passed.

Action Item #69-A24:  Unicode Liaison to WG2
The Unicode Liaison is to take our position and relay it to WG2 and they
are to process our proposal under their regular review process.

Action Item #69-A25:  Joan Aliprand
Joan Aliprand should contact the Republic of Macedonia with the results
of the UTC Meeting.

K.      Braille Proposal
Background.
Functionally, Braille is more similar to general symbols than to
letters, since the interpretation of the characters depends on the
language with which they are used. The set of symbols is well defined,
consisting of two parallel columns of dots, with each column having 3
dots (or 4 for extended Braille).  There are two variants of extended
Braille (depending on the baseline) but since they are never intermixed,
these variants can be safely unified.

Allocation and Appearance.
The characters are proposed for encoding at U+2800 to U+28FF.  Since the
characters are independent of language, the allocation can be
systematically arranged according to which dots are being used.  Since
there are 8 possible dots which can be either on (black/raised) or off
(blank), this results in 256 different symbols.  While some of these
symbols may not be used in any particular language, the full symbol set
can be encoded for use or extension by any language.

The representative glyph for each character is directly derived from the
character code, as follows. Number the dots in the following way:

------
|4  0|
|5  1|
|6  2|
|7  3|
------

Subtract 2800 (hexadecimal) from the character code.  If bit 0 is 1,
then dot 0 is black/raised; if bit 1 is 1, then dot 1 is black/raised;
and so on.  For example, applying this to U+2852 results in the bits
0101 0010, and therefore the following symbol:

------
|*   |
|   *|
|*   |
|    |
------

The glyph at U+2800 is blank, but since it should always have the same
width as the other symbols, it should not be unified with any of the
other space characters.

The proposed names of the characters are Braille Pattern 00 through
Braille Pattern FF.
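
The derivation of glyph and name from the character code, as described
above, can be sketched as follows.  This is only an illustration of the
dot numbering given in these minutes (dots 0 to 3 down the right column,
dots 4 to 7 down the left); the function names are hypothetical, and the
numbering scheme was still subject to adjustment.

```python
def braille_glyph(code_point):
    """Draw the representative glyph for a code point in U+2800..U+28FF."""
    bits = code_point - 0x2800
    if not 0 <= bits <= 0xFF:
        raise ValueError("outside the proposed range U+2800..U+28FF")
    rows = []
    for row in range(4):
        left = "*" if bits & (1 << (row + 4)) else " "   # dots 4..7
        right = "*" if bits & (1 << row) else " "        # dots 0..3
        rows.append("|" + left + "  " + right + "|")
    return "\n".join(["------"] + rows + ["------"])


def braille_name(code_point):
    """Return the proposed name, e.g. 'Braille Pattern 52' for U+2852."""
    return "Braille Pattern %02X" % (code_point - 0x2800)


if __name__ == "__main__":
    print(braille_name(0x2852))
    print(braille_glyph(0x2852))
```

Running it for U+2852 reproduces the symbol drawn above, and U+2800
yields an empty cell of the same width as every other pattern.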

Motion #69-15:  The UTC agrees to allocate 256 codepoints for Braille.
These should be allocated in the Symbol Zone, starting at U+2800.
The name format will be "Braille Pattern 00" through "Braille Pattern
FF".  The representative glyphs will be composed of the dot pattern
corresponding to the name, with the dots numbered from top to bottom,
right to left, and a binary one representing a raised dot.  There are
four dots in each of the two columns.
Moved by John Jenkins, seconded by Dr. V.S. Umamaheswaran.
Unanimous vote
Motion passed.

Action Item #69-A26:  The Unicode Liaison to WG2
The Unicode Liaison is to take our position and relay it to WG2 and they
are to process our proposal under their regular review process.  The UTC
may adjust the numbering scheme based on its relationship with the
Japanese proposal.

L.      Armenian Proposal:
Rick McGowan gave a brief review of this proposal [X3L2/96-064].
Several of these characters are already coded within the Unicode
Standard, and the proposed name changes were without merit and were
viewed as unacceptable.  The proposal should be resubmitted at a later
date.

Action Item #69-A27:  Michel Suignard and the Unicode Liaison to WG2:
The Unicode Liaison should discuss this topic with Joe Becker for help.
Michel Suignard is to also contact Richard Youatt to find out about
Armenian Unicode Implementations.

IV.     Character Glyph Model [X3L2/96-037]
A.      Summary of voting on NWI [X3L2/96-042]
The Japanese have submitted comments through Mr. T.K. Sato and
recommended that the handling of ideographic characters be removed due
to the Latin-centric tendencies of the model.

B.      Response to Japanese ballot comments [X3L2/96-057]
Ed Hart wrote a response with the help of Alan Griffee.  These documents
are numbered [X3L2/96-057] and [X3L2/96-058].

Glenn Adams pointed out the shortcomings of the Japanese report, but he
also pointed out that Ed Hart and Alan Griffee need help with
this model.  Han Unification is a complex issue and should refer to the
latest annex from WG2.  CJK issues are very germane, and the model will
allow people to see the difference between a character and a glyph.

Action Item #69-A28:  Glenn Adams
Glenn Adams will take on the task of producing the text for the
Character Glyph Model with feedback from John Jenkins.

V.      Current Ballots:
A.      DISP 11186-2 [X3L2/96-048]
Results from previous ballots are available.  This document shows who
voted for and against a proposal, their comments, and which ballots.

B.      pDAM #9:  Identifiers of characters [X3L2/96-047 and X3L2/96-048]

C.      Reports on ISO Ballots
                1.      ISO 8859
                2.      ISO 1073
                3.      Cherokee
                4.      Yiddish

D.      CHANGE OF VOTE
Ken Whistler, the Sybase representative, announced that Sybase is
satisfied that the conditions in its vote on Version 2.0 have been met,
so its NO vote changes to YES.

VI.     Internet Documents
A.      Francois Yergeau's RFC draft for UTF-8 (Hart)
See document,
ftp://ds.internic.net/internet-drafts/draft-yergeau-utf8-00.txt
for further details.

Action Item #69-A29:  Gary Roberts
Contact Francois Yergeau for a clarification on minimum character values
for UTF-8 representations.
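
One reading of "minimum character values" is the lowest value that each
UTF-8 sequence length should be used to represent.  The sketch below
checks those boundaries against Python's built-in UTF-8 codec, which is
an assumption of this note rather than anything taken from the draft; it
covers only the 1- to 4-byte forms, and any longer forms the draft may
define are not shown.

```python
# Lowest code point that needs each sequence length, per Python's codec.
MIN_VALUE_FOR_LENGTH = {1: 0x0000, 2: 0x0080, 3: 0x0800, 4: 0x10000}


def utf8_length(code_point):
    """Number of bytes Python's UTF-8 codec emits for one code point."""
    return len(chr(code_point).encode("utf-8"))


if __name__ == "__main__":
    for length, minimum in sorted(MIN_VALUE_FOR_LENGTH.items()):
        print("%d byte(s): from U+%04X" % (length, minimum),
              "-> encoder uses", utf8_length(minimum), "byte(s)")
```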

B.      Internationalization of HTML
The Draft of the HTML Proposal has been closed.  The UTC would like to
have it noted that there needs to be better communications between
Unicode Consortium and IETF concerning character set issues.

Glenn gave an overview on HTML and the members asked questions.  Mark
sent out a survey to Unicore and of the four he talked to, none
implemented any of the algorithms.

Action Item #69-A30:  Mark Davis
Produce a letter from the Consortium to HTML working group expressing
the need for better communication between our groups.

Action Item #69-A31:  Glenn Adams
Provide the names and addresses of contacts within the HTML working group
for Mark Davis' letter.

Motion #69-16:  The Unicode Technical Committee has a requirement that
the BIDI codes be mappable 1:1 to tag and attribute combinations in
HTML.
Moved by John Jenkins, seconded by Dr. V.S. Umamaheswaran.
Vote:  11 yes, 0 no, 1 Abstention (Spyglass)
Motion passed

Action Item #69-A32:  Glenn Adams, Mark Davis
Glenn Adams and Mark Davis are to provide specific cases illustrating
the motion (requirement that the BIDI codes be mappable 1:1 to tag and
attribute combinations in HTML.)
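
As an illustration of what a 1:1 mapping could look like, the sketch
below converts the explicit embedding and override codes into markup.
The choice of a "span" element with a "dir" attribute for embeddings and
a "bdo" element for overrides is an assumption made for this example;
the minutes do not name specific tags, and the function name is
hypothetical.  The sketch does not escape other HTML-significant
characters.

```python
# Explicit BIDI codes mapped to an assumed (tag, dir value) combination.
BIDI_TO_HTML = {
    "\u202A": ("span", "ltr"),   # LEFT-TO-RIGHT EMBEDDING
    "\u202B": ("span", "rtl"),   # RIGHT-TO-LEFT EMBEDDING
    "\u202D": ("bdo", "ltr"),    # LEFT-TO-RIGHT OVERRIDE
    "\u202E": ("bdo", "rtl"),    # RIGHT-TO-LEFT OVERRIDE
}
PDF = "\u202C"                   # POP DIRECTIONAL FORMATTING


def bidi_codes_to_markup(text):
    """Replace BIDI codes with opening tags and each PDF with an end tag."""
    out, open_tags = [], []
    for ch in text:
        if ch in BIDI_TO_HTML:
            tag, direction = BIDI_TO_HTML[ch]
            out.append('<%s dir="%s">' % (tag, direction))
            open_tags.append(tag)
        elif ch == PDF and open_tags:
            out.append("</%s>" % open_tags.pop())
        else:
            out.append(ch)       # ordinary text, or a PDF with nothing open
    while open_tags:             # close anything left unterminated
        out.append("</%s>" % open_tags.pop())
    return "".join(out)


if __name__ == "__main__":
    print(bidi_codes_to_markup("abc \u202Bdef\u202C ghi"))
```

Because every code has exactly one markup counterpart in this table, the
conversion could in principle be reversed without loss.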

VII.    Reports
A.      International Unicode Conference (Moore)
Nothing new to report at this time.

B.      The Unicode Standard, Version 2.0 (Davis)
Nothing new to report at this time.

VIII.   Old Business
A.      Arabic characters:  revised proposal (Aliprand)
Nothing to report at this time.

IX.     Other Business
A.      Handling obsolete codes
This issue has been addressed by minor textual changes in Version 2.0.
The change is generic (not specific to Tibetan).

B.      Simple Unicode text compression (Davis)
Misha Wolf, of Reuters, Ltd., gave a talk at IUC #8 on Reuters' method
for fast and efficient compression.  The method uses 1- or 2-byte codes
with a multi-windowed approach, and it covers all of the Unicode
Standard.

Action Item #69-A33:  All UTC Participants
Members are requested to study Misha Wolf's paper from IUC #8 and be
prepared to discuss it at UTC #70 in September.

Action Item #69-A34:  Mark Davis
Make a Unicode Technical Report (UTR) draft of Misha Wolf's paper on
text compression.  If possible, he is to see if Misha Wolf's paper can
be distributed electronically.

C.      Behavior of NSM after control/format characters (Davis)
What happens if there is an "A" and a "^" separated by a format code?  An
example was already addressed in Version 1.0, i.e., where a combining
character occurs immediately after a paragraph return.  Dean Abramson
would like to see a format character that can be added between the base
character and the nonspacing mark.  Mark would like to see the UTC
define what is acceptable behavior in this scenario.
Base characters
formatting characters
NSM-Combining characters
Separation
ORC-substitution character

Motion #69-17:  If there is a combining character sequence which
includes format or control characters, an implementation is not required
to treat that sequence as canonically equivalent to the same sequence
of combining characters without those format or control characters.
Implementations may treat the combining characters that occur after the
format or control characters as isolated.
Moved by Lisa Moore, seconded by Arnold Winkler.
Unanimous vote
Motion passed.
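
The effect of the motion can be illustrated with the "A" and "^" case
discussed above.  The sketch assumes Python's unicodedata module and its
NFC normalization as a convenient stand-in for canonical equivalence;
neither is mentioned in the minutes, and the choice of U+200E as the
intervening format character is arbitrary.

```python
import unicodedata

PLAIN = "A\u0302"               # A + COMBINING CIRCUMFLEX ACCENT
WITH_FORMAT = "A\u200E\u0302"   # same, with a format character between


def composes_to_single_character(text):
    """True if canonical composition reduces the text to one code point."""
    return len(unicodedata.normalize("NFC", text)) == 1


def strip_format_characters(text):
    """Drop characters whose General Category is Cf (format)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


if __name__ == "__main__":
    print(composes_to_single_character(PLAIN))
    print(composes_to_single_character(WITH_FORMAT))
```

The sequence with the intervening format character is left uncomposed,
so the circumflex is effectively isolated, which is the behavior the
motion permits.  An implementation that wants the two sequences to match
would have to remove the format character itself first, as the second
helper does.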

D.      East Asian SC Report
The report covered adding new ideographs vertically, and filling in the
holes horizontally.  These characters are very rarely used.  There are
approximately 6600 characters that the authors are looking at coding in
the BMP.  The set is unified and legitimate.  The only thing that sets
them apart is their rarity.  Most are Chinese, with approximately 300
submitted by Japan.  Each character had to be justified individually for
its eventual inclusion in this proposal.  There were checks and balances
put in place and observed.  It is more important to get a full set of
ideographs than to get a frequently used set into the BMP.  John
Jenkins recommended that the UTC accept the repertoire.  Vertical
extension will add more character variation that can then go into the
BMP.  These characters are extremely rare and should not be
added--period.  This would also add a Horizontal column while "filling
in the gaps" that occur through the URO with other characters.  The
repertoire, such as it is, has been forwarded by the IRG committee.
Discussion centered around placement of vertical extension.  There were
three groups:  Group 1, consisting of 6600 characters (mostly from GB5,
GB3, GBK); Group 2 (unknown); Group 3 (unknown).

Vertical Extensions - adding new ideographs
Horizontal Extensions - add V column and add in missing forms (note:
Conflicts with national rejections.)

Vertical extensions cannot be unified with URO characters.  Each
country was asked for its highest priority characters.  These have been
mutually compared and have been compared to the URO.  Prof. Lancaster
(UCB) has 15,000 glyphs that he is unable to find in Unicode.  The
largest Kanji collection is Morohashi, which includes 50,000 characters.
A respected Chinese dictionary has 80,000; unless those are included,
the Unicode Standard will continue to receive pressure for additional
characters.

Motion #69-18:  The UTC accepts the repertoire of the vertical extension
as such but only to encode within the BMP.
Moved by Rick McGowan, seconded by Mike Ksar
Vote:  10 yes, 0 no, 1 abstain
Motion passed.

Action Item #69-A35:  Unicode Liaison to IRG
The Unicode Liaison is to take our position and relay it to the IRG.

E.      Kobayashi proposal for CJK ideographs
There was a discussion among members concerning Tatsuo Kobayashi's
proposal. The document "Proposal Re:  CJK ideographs" (Date:  26 January
1996 Author:  Tatsuo Kobayashi) was discussed.

Action Item #69-A36:  John Jenkins
Forward Kobayashi's paper, with comments from the UTC, to the IRG.

Action Item #69-A37:  John Jenkins
Start a discussion on the Unicore DL regarding this proposal.

F.      Sorting Standards (Winkler)
Committee/Document:  SC22/WG20 N 453
Title:               CD 14651:  Intl. String Ordering
Editor:              Alain LaBonte
Status:              submitted for registration in JTC1

Committee/Document:  CEN/TC304 WG1 N506
Title:               Sorting order of the extended European subset of
                     10646
Editor:              M. Everson
Status:              draft for TC304

Committee/Document:  ISO TC37 SC2/WG3 N65
Title:               CD 12199:  Alphabetical ordering of multilingual
                     terminology and lexicographical data represented in
                     the Latin alphabet
Editor:              Havard Hjulstrad
Status:              submitted for CD ballot (only for Latin)

Committee/Document:  ANSI/NISO Z39.75 - 199x
Title:               Alphabetic arrangement of letters and the sorting of
                     numerals and other symbols
Editor:              Hans Wellisch
Status:              undefined, draft available, mainly bibliographic
                     application

The UTC objects to the non-handling of combining characters.  The Chair
is to talk to the WG20 convenor:  the implementation of level 1 is
applicable only to traditional programming languages and needs to be
revisited.  The work on 14651, as it stands, makes a conformant
implementation of Unicode impossible.  This would make 10646 and the
Unicode Standard desynchronized.  Combining sequences must compare equal
to their canonical equivalents.
Problems occur in sorting double-level characters.  The draft is
over-defined and its scope is more limited than its charter states.
Further discussion is needed.

X.      Other Items
A.      Codes for scripts - proposal M. Everson
See this document for further information:
http://www.indigo.ie/egt/standards/scriptscodes-en.html
It proposes 2-letter codes for representation of names for scripts.

B.      ISO/TC46/SC2 Transliteration standard
John Clews chairs this group and sent out an invitation to the Unicode
DL.  This group would be discussing conversion of written languages.  If
participants are interested in joining the list, contact
"tc46sc@elot.gr". (check the document for information on e-mail.)

C.      Clarification of source code separation rules (Hart)
This document was sent out by Ed Hart.

D.      Addition of QUAD
Motion #69-19:  The UTC accepts the APL Quad character name and glyph.
The code point is to be in the Symbols block, assigned in consultation
with WG2.
Moved by Dr. V.S. Umamaheswaran, seconded by Gary Roberts.
Vote:  8 yes, 2 against (NeXT, Sybase), 0 abstained
Motion passed.

E.      Properties issue in COBOL
ANSI and ISO COBOL would like to reference the Unicode case-mapping
property table, and would like it to be language-independent.  Asmus
Freytag confirmed that Turkish I would be the only exception.  He also
stated
that he was in favor of this item.  Joan Aliprand also stated that it
would be consistent with one of the Unicode Board's directives, to have
the Unicode Standard referenced by other standards.
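
The Turkish I exception can be seen in a small sketch.  It assumes
Python's built-in str.upper() and str.lower() as an example of a
language-independent case mapping; the two Turkish helpers are
hypothetical names introduced here and are not part of any COBOL or
Unicode specification.

```python
DOTTED_CAPITAL_I = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"    # LATIN SMALL LETTER DOTLESS I


def turkish_upper(text):
    """Uppercase with Turkish pairing: i -> dotted capital, dotless -> I."""
    text = text.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_SMALL_I, "I")
    return text.upper()


def turkish_lower(text):
    """Lowercase with Turkish pairing: I -> dotless small, dotted -> i."""
    text = text.replace("I", DOTLESS_SMALL_I).replace(DOTTED_CAPITAL_I, "i")
    return text.lower()


if __name__ == "__main__":
    print("istanbul".upper())          # language-independent mapping
    print(turkish_upper("istanbul"))   # Turkish-specific mapping
```

A single language-independent table pairs "i" with "I", so Turkish text
needs the two extra substitutions shown before the general mapping is
applied.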

Action Item #69-A38:  Arnold Winkler
Distribute in the X3L2 mailing the current status of the JTC1 document
on use of publicly available specifications.

Action Item #69-A39:  Arnold Winkler
Send advance copy of this document to Lisa Moore.

F.      UTF-7
Problem in IETF; maintain source information about codeset used to allow
round-trip mapping.

Action Item #69-A40:  (?)
Communicate problem to David Goldsmith.

G.      Support the printing of 10646
To help with staying in synch with ISO 10646, helping WG2 with charts
and tables will give us the lead when we want to next republish.  Mark
Davis was highly in favor of this measure.  He was also willing to put in
some effort towards seeing it happen.  There were several contributions
towards this topic.

Action Item #69-A41:  Mark Davis
Provide samples of the charts and tables that are being published in the
Unicode Standard, Version 2.0 to Mike Ksar for evaluation in WG2.

Action Item #69-A42:  Glenn Adams
The Unicode Liaison is to convey our offer to print charts to WG2.

The chair thanked Microsoft for providing facilities and food.  The
meeting adjourned by consensus.
