L2/12-344

Title:  WG2 Consent Docket

Author: Ken Whistler

Date:   November 1, 2012

Action: For consideration by UTC 


WG2 #60 met in Chiang Mai, Thailand, the week of October 22 - 26, 2012.
During that meeting a number of resolutions were taken which progressed
Amendments 1 and 2 to 10646 3rd Edition and which started the process
for a CD for 10646 4th Edition. See L2/12-340 (= WG2 N4354) for the
full details of all the resolutions.

As usual, in this consent docket, I summarize just the parts of the
actions taken by WG2 which result in a different status between WG2
and the UTC regarding various character approvals. These are the
differences where the UTC needs to make some decision regarding how
to synchronize approvals (or to oppose a proposed change).

For convenience, the changes are grouped here by amendment (or CD).

=======================================================================

Changes Related to Amendment 1

Amendment 1 is now being progressed to FDAM status. The FDAM ballot is
a non-technical ballot, so at this point it is too late to make further
technical changes on its content. I recommend that the UTC simply approve
the few points where it is out of synch with former UTC approvals.
However, see below for issues regarding decompositions.

The full listing of the revised Amendment 1 content for FDAM balloting,
including a number of glyph changes, can be seen in WG2 N4381 (= L2/12-3xx).

A1. Three Latin Character Name Changes

Three characters in the additions for Teuthonista had their names changed
as part of the disposition of ballot comments. Specifically, these are:

AB53 LATIN SMALL LETTER STRETCHED X 
 --> LATIN SMALL LETTER CHI
AB54 LATIN SMALL LETTER STRETCHED X WITH LOW RIGHT RING
 --> LATIN SMALL LETTER CHI WITH LOW RIGHT RING
AB55 LATIN SMALL LETTER STRETCHED X WITH LOW LEFT SERIF
 --> LATIN SMALL LETTER CHI WITH LOW LEFT SERIF
 
Effectively, this was a replacement of "STRETCHED X" by "CHI", based on
the evidence presented about the use and development of this convention
in Teuthonista.

Recommendation: Approve these three name changes.
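
The replacement described above is mechanical, so it can be sketched as a
simple string substitution. This is only an illustration: the names OLD_NAMES,
rename, and NEW_NAMES are hypothetical and are not part of any balloted text
or tool.

```python
# Names before the change, keyed by code point (copied from the list above).
OLD_NAMES = {
    0xAB53: "LATIN SMALL LETTER STRETCHED X",
    0xAB54: "LATIN SMALL LETTER STRETCHED X WITH LOW RIGHT RING",
    0xAB55: "LATIN SMALL LETTER STRETCHED X WITH LOW LEFT SERIF",
}


def rename(name: str) -> str:
    """Replace the "STRETCHED X" element of a character name with "CHI"."""
    return name.replace("STRETCHED X", "CHI")


# Hypothetical helper table: the three names after the change.
NEW_NAMES = {cp: rename(name) for cp, name in OLD_NAMES.items()}

for cp, name in sorted(NEW_NAMES.items()):
    print(f"{cp:04X} {name}")
```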

A2. Decomposition Issues in Amendment 1

WG2 doesn't formally decide on decompositions for repertoire in amendments
to 10646. That is the purview of the normalization algorithm and the UTC.
However, because of the way the amendments and standard are printed now,
decompositions entered into the names list (or not entered into the
names list, as the case may be) appear visibly in the balloted documents.
By my accounting, there are two specific anomalies in the Amendment 1
names list that the UTC needs to take a stand on, so they can be corrected,
if necessary, before Amendment 1 is published.

First, there are two compatibility decompositions listed amongst the
Duployan characters:

1BC5A DUPLOYAN LETTER OW
        # <medial> 1BC56
1BC5B DUPLOYAN LETTER OU
        # <initial, final> 1BC5A
        
I don't think these make any sense (and the second one isn't even syntactically
correct), so they should be removed from the Amendment.
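
For illustration, here is a rough sketch of the kind of syntax check that
flags the second entry. It assumes that the syntax problem is the two-tag form
"<initial, final>", and that a compatibility decomposition line consists of
"#", exactly one of the tags used in UnicodeData.txt, and one or more
hexadecimal code points. The regular expression and the function name
parse_compat_decomposition are my own approximation, not the official names
list grammar.

```python
import re

# Compatibility formatting tags as used in UnicodeData.txt decompositions.
COMPAT_TAGS = {
    "font", "noBreak", "initial", "medial", "final", "isolated",
    "circle", "super", "sub", "vertical", "wide", "narrow",
    "small", "square", "fraction", "compat",
}

# Approximate shape of a line: "# <tag> HEX [HEX ...]" (an assumption).
DECOMP_LINE = re.compile(r"^\s*#\s+<([A-Za-z]+)>((?:\s+[0-9A-F]{4,6})+)\s*$")


def parse_compat_decomposition(line):
    """Return (tag, [code points]) for a well-formed line, else None."""
    match = DECOMP_LINE.match(line)
    if match is None or match.group(1) not in COMPAT_TAGS:
        return None
    return match.group(1), [int(h, 16) for h in match.group(2).split()]


print(parse_compat_decomposition("        # <medial> 1BC56"))
print(parse_compat_decomposition("        # <initial, final> 1BC5A"))
```

Under these assumptions the first line parses and the second does not. Note
that a line being well-formed says nothing about whether the decomposition
itself makes sense.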

Second, there are four Latin modifier letters which should have <super>
decompositions added, for consistency with the way we define similar modifier
letters elsewhere in the standard.

AB5C MODIFIER LETTER SMALL HENG
AB5D MODIFIER LETTER SMALL L WITH INVERTED LAZY S
AB5E MODIFIER LETTER SMALL L WITH MIDDLE TILDE
AB5F MODIFIER LETTER SMALL U WITH LEFT HOOK

These should be corrected to:

AB5C MODIFIER LETTER SMALL HENG
        # <super> A727
AB5D MODIFIER LETTER SMALL L WITH INVERTED LAZY S
        # <super> AB37
AB5E MODIFIER LETTER SMALL L WITH MIDDLE TILDE
        # <super> 026B
AB5F MODIFIER LETTER SMALL U WITH LEFT HOOK
        # <super> AB52

These names list corrections are editorial in status as far as the Amendment text
goes, but are normative (and immutable) from the point of view of the
publication of Unicode 7.0. So I think they should be caught and corrected
ASAP. I'm not sure what the original proposals had in these cases without further
research in the documents, but these two cases clearly represent oversights in
the review of the Amendment 1 ballot text.

Recommendation #1: Explicitly disavow the Duployan decompositions.

Recommendation #2: Approve the four listed decompositions for AB5C..AB5F.
 

===========================================================================

Changes Related to Amendment 2

Amendment 2 is now being progressed to DAM status. The DAM ballot is a
technical ballot (enquiry stage), so the UTC has more flexibility, in general,
in responding to these changes. In most cases, however, the changes received
easy consensus in WG2, and I don't see any particular barriers to UTC
approval at this stage.

The full listing of the revised Amendment 2 content for DAM balloting,
including a number of glyph changes, can be seen in WG2 N4380 (= L2/12-3xx).


B. Old Hungarian

Old Hungarian was the topic of extended discussion during the WG2 meeting. There is a
separate ad hoc report detailing that discussion and recommendations from
the meeting. (See L2/12-334 = WG2 N4374.) The upshot of the discussion was
that WG2 decided to leave Old Hungarian in Amendment 2, but to change the
name of the script and block (and the script designator for character
names) from "Old Hungarian" to just "Hungarian". This removed the "Old"
part of the name, which seemed to be the most objectionable issue for some
of the folks objecting to the encoding. There was no consensus in WG2 to
switch to the name "Rovas" or variants similar to that. There were also no
changes made to the inventory of characters.

Recommendation #1: To stay in synch with the current ballot contents, I
recommend that the UTC reapprove the script with the change of name
for the script and block to "Hungarian", and with the updated character
names as listed in WG2 N4380. (This, by the way, will also put the UTC
back in synch regarding a few other character name changes for this
script from the prior WG2 meeting.)

Recommendation #2: Because the progression of the Hungarian script in
DAM 2 continues without U+10CFE OLD HUNGARIAN NUMBER FIVE HUNDRED (removed
in a prior ballot), I recommend that the UTC rescind its approval of
that character.

C. Pahawh Hmong Clan Logographs

The UTC had deferred approval of 18 Pahawh Hmong clan logographs, pending
receipt of more information regarding their status. More information was
forthcoming, and WG2 decided to rearrange the 18 logographs in the ballot
and add a 19th. At this point, with more evidence, I think these are
ready for approval.

Recommendation: Approve 19 Pahawh Hmong clan logographs in the range 16B7D..16B8F.
(See WG2 N4380 for details.)

D. Mende Numbers

After a long discussion in WG2, an acceptable compromise was reached on
how to represent the Mende numbers. The upshot was to accept the nine
digits that everyone agreed about already, but then to add 7 number
bases as combining marks below, for a total of 16 characters to represent
the numbers. The blocks were also adjusted, as after the compromise there
was no need for a separate Mende numbers block.

Recommendation:
  a. Rescind the UTC approval of the Mende Numbers block (1E8D0..1E8EF)
  b. Extend the end of the Mende block from 1E8CF to 1E8DF.
  c. Move the 9 Mende digits already approved by the UTC from 1E8D1..1E8D9
     to 1E8C7..1E8CF.
  d. Approve the addition of 7 combining Mende number bases at 1E8D0..1E8D6.
     (See WG2 N4380 for details.)
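
The bookkeeping in items b through d can be double-checked with a little
arithmetic. The following is only a sketch: the code points are the ones
listed above, while the helper cp_range and the constant names are
hypothetical, introduced just for this check.

```python
def cp_range(first, last):
    """Inclusive range of code points first..last."""
    return range(first, last + 1)


OLD_DIGITS = cp_range(0x1E8D1, 0x1E8D9)    # digits as already approved by the UTC
NEW_DIGITS = cp_range(0x1E8C7, 0x1E8CF)    # digits after the move (item c)
NUMBER_BASES = cp_range(0x1E8D0, 0x1E8D6)  # combining number bases (item d)
NEW_BLOCK_END = 0x1E8DF                    # extended end of the Mende block (item b)

# Every digit moves down by the same distance.
OFFSET = NEW_DIGITS.start - OLD_DIGITS.start
MOVED = {old: old + OFFSET for old in OLD_DIGITS}

assert len(OLD_DIGITS) == len(NEW_DIGITS) == 9
assert len(NUMBER_BASES) == 7
assert set(MOVED.values()) == set(NEW_DIGITS)
assert not set(NEW_DIGITS) & set(NUMBER_BASES)  # no collisions
assert max(NUMBER_BASES) <= NEW_BLOCK_END       # fits in the extended block

print(len(NEW_DIGITS) + len(NUMBER_BASES))  # 16
```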
     
E. MARK SIGN

WG2 decided, on the basis of feedback from the German NB, to change the
name of U+20BB MARK SIGN to NORDIC MARK SIGN.

Recommendation: Accept this name change.

F. Pictographic Symbols (Webdings and Wingdings)

There were further substantive and glyphic changes made to the various
webdings (and other pictographic symbols) in the Miscellaneous Symbols
and Pictographs block (U+1F300..U+1F5FF) and in the Transport and Map Symbols
block (U+1F680..U+1F6FF). These changes involved some code points moved,
some name changes, some glyph changes, and the addition of ten more
characters. Although people may have further comments to make or may object
to particular changes or additions, the cleanest way to manage
the bookkeeping for the changes in a repertoire like this is just to
first go on record as approving the revised repertoire and glyphs as
documented in the ballot.

Recommendation: Accept the revised repertoire, character names, and
glyphs in the Miscellaneous Symbols and Pictographs (count = 76) and Transport and
Map Symbols (count = 21) blocks, as documented in WG2 N4380.

===========================================================================

Changes Related to the CD for the 4th Edition

At the recommendation of the project editor, WG2 decided to handle further
new additions to 10646 in the context of a new project subdivision for
a CD for the 4th Edition, rather than trying to tack them onto Amendment 2.
Some of the additions represent small repertoire sets already approved
by the UTC, so I will skip over those. However, there were a number of
other additions approved for the CD that the UTC has not yet approved.

G. Latin Letter Additions

WG2 approved 6 more Latin letters:

A7B2 LATIN CAPITAL LETTER J WITH CROSSED-TAIL
A7B3 LATIN CAPITAL LETTER CHI
A7B4 LATIN CAPITAL LETTER BETA
A7B5 LATIN SMALL LETTER BETA
A7B6 LATIN CAPITAL LETTER OMEGA
A7B7 LATIN SMALL LETTER OMEGA

A7B2 is an orthographic uppercase pair for an African language. (WG2 N4332)
The other five the UTC has seen and discussed from L2/12-270 (= WG2 N4297).
The capital letter chi is the uppercase pair for the renamed LATIN SMALL LETTER CHI
(see item A1 above). The betas and omegas are the other characters from that
document which WG2 reached consensus on. The remaining 3 characters in
L2/12-270 (tau gallicum and the uppercase small cap i) were more controversial,
and were not approved for ballot, pending more evidence and discussion.

Recommendation: Approve these 6 Latin character additions.

H. Devanagari Sign Siddham

WG2 approved:

A8FC DEVANAGARI SIGN SIDDHAM

See L2/12-123R (= WG2 N4260)

Recommendation: Approve this character addition.

I. Sharada Character Additions

WG2 approved 6 more Sharada characters:

111C9 SHARADA SANDHI MARK
111CE SHARADA CONTINUATION SIGN
111DB SHARADA HEADSTROKE
111DC SHARADA SIGN SIDDHAM
111DD SHARADA SECTION MARK-1
111DE SHARADA SECTION MARK-2

See WG2 N4330, N4329, N4337, N4331, and N4338.

Recommendation: Approve these 6 Sharada character additions.

J. East-Slavic Musical Symbols

WG2 approved 11 East-Slavic musical symbols. This was based on a revision
of the document on this topic which the UTC saw and discussed at some length
already. (WG2 N4362) The characters are:

1D1DE MUSICAL SYMBOL KIEVAN C CLEF
...
1D1E8 MUSICAL SYMBOL KIEVAN FLAT SIGN 

See WG2 N4362 for the full list and glyphs.

Recommendation: Approve these 11 musical symbols.

K. Ahom

WG2 approved the Ahom script, based on WG2 N4321.

Recommendation: Approve 57 Ahom characters in the range 11700..1173F, in
the Ahom block, also 11700..1173F.

L. Anatolian Hieroglyphs

WG2 approved the Anatolian hieroglyphs, based on WG2 N4282. There were some
matters of contention in the script, particularly as regards the use of
three canonical decompositions in the set. However, the cleanest way forward
is to get the approval of the entire repertoire on record, and then focus
on possible ballot comments regarding the few problematical issues in
the repertoire.

Recommendation: Approve 583 Anatolian hieroglyphic characters in the
range 14400..14646, in the Anatolian Hieroglyphs block, 14400..1467F.

M. Multani

WG2 approved the Multani script, based on WG2 N4159.

Recommendation: Approve 38 Multani characters in the range 11280..112A9, in
the Multani block, 11280..112AF.

N. Hatran

WG2 approved the Hatran script, based on WG2 N4324.

Recommendation: Approve 30 Hatran characters in the range 108E0..108FF, in
the Hatran block, also 108E0..108FF.

O. CJK Extension E

This is a *big* one. WG2 approved the next big extension of CJK Unified Ideographs,
based on WG2 N4358. This will require careful review, of course. The UTC can
decide whether to approve now or to wait until our CJK experts have another
go at the table. There are 5768 characters in total, in the range 2B820..2CEA7,
in the CJK Unified Ideographs Extension E block, 2B820..2CEAF.
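
As a quick sanity check on the numbers just cited, the count and the block
size follow directly from the range endpoints. This sketch assumes the
characters fill the range 2B820..2CEA7 contiguously; the variable names are
hypothetical.

```python
FIRST_ASSIGNED, LAST_ASSIGNED = 0x2B820, 0x2CEA7  # character range cited above
BLOCK_START, BLOCK_END = 0x2B820, 0x2CEAF         # block range cited above

# Inclusive ranges, so add 1 to each difference.
assigned = LAST_ASSIGNED - FIRST_ASSIGNED + 1
block_size = BLOCK_END - BLOCK_START + 1
unassigned_tail = block_size - assigned

print(assigned)         # 5768
print(block_size)       # 5776
print(unassigned_tail)  # 8
```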

Recommendation: Discuss and decide what to do.