L2/13-154

Title:  WG2 Consent Docket

Author: Ken Whistler

Date:   July 22, 2013

Action: For consideration by UTC 


WG2 #61 met in Vilnius, Lithuania, the week of June 10 - 14, 2013.
During that meeting a number of resolutions were taken which progressed
Amendment 2 to 10646 3rd Edition and which also progressed
the CD for 10646 4th Edition. See L2/13-144 (= WG2 N4404) for the
full details of all the resolutions.

As usual, in this consent docket, I summarize just the parts of the
actions taken by WG2 which result in a different status between WG2
and the UTC regarding various character approvals. These are the
differences where the UTC needs to make some decision regarding how
to synchronize approvals (or to oppose a proposed change).

For convenience, the changes are grouped here by amendment (or CD).

Note that the pipeline page:

http://www.unicode.org/alloc/Pipeline.html

has already been updated to reflect changes in approvals by WG2, and
to highlight differences from the current approvals by the UTC, so
that page can be useful in following the discussion below on the
individual issues.

=======================================================================

Changes Related to Amendment 2

Amendment 2 is now being progressed to FDAM status. The FDAM ballot is
a non-technical ballot, so at this point it is too late to make further
technical changes on its content. I recommend that the UTC simply approve
the few points where it is out of synch with former UTC approvals.
Note that with the additions from Amendment 2, the anticipated repertoire
of Unicode 7.0 is now complete.

The full listing of the revised Amendment 2 content for FDAM balloting,
including a number of glyph changes, can be seen in WG2 N4458 (= L2/13-150).

A. MANAT SIGN

WG2 approved the addition of U+20BC MANAT SIGN. The UTC has seen this
character before, but held off on approval waiting for further evidence of
use. WG2 saw that additional evidence, and decided to accelerate the encoding
into Amendment 2, to avoid having to do a hurry-up publication of a version
including just the currency sign addition. The relevant document is WG2 N4445.

Recommendation:

The UTC should approve U+20BC MANAT SIGN for Unicode 7.0.

B. Move and rename of two phonetic characters

WG2 decided to move the code points for two characters in ballot, and
renamed one to avoid a name inconsistency pointed out by the German NB.
The net change was from:

U+A7AE LATIN SMALL LETTER INVERTED ALPHA
U+A7AF LATIN LETTER SMALL CAPITAL OMEGA

to:

U+AB64 LATIN SMALL LETTER INVERTED ALPHA
U+AB65 GREEK LETTER SMALL CAPITAL OMEGA

The name change from "LATIN" to "GREEK" for the small capital omega was
to avoid a shape collision problem for the Latin capital omega. It also
followed precedent for some other Greek letters used for phonetic
transcription.

Recommendation:

The UTC should approve the two code point changes and one name change
for Unicode 7.0.

C. Name changes for Siddham punctuation

WG2 changed the names of two Siddham punctuation marks, from:

U+115C4 SIDDHAM SEPARATOR-1
U+115C5 SIDDHAM SEPARATOR-2

to:

U+115C4 SIDDHAM SEPARATOR DOT
U+115C5 SIDDHAM SEPARATOR BAR

Recommendation:

The UTC should approve the two name changes for Unicode 7.0.

D. Pahawh Hmong clan logographs

WG2 approved 19 Pahawh Hmong clan logographs which the UTC has seen, but not
yet accepted. The characters are in the range U+16B7D..U+16B8F. The
complete list of character code points, names, and glyphs can be
seen in L2/13-150 in the Pahawh Hmong block.

Recommendation:

The UTC should approve the addition of these 19 clan logographs for Unicode 7.0.

E. Mende --> Kikakui --> Mende Kikakui

The block which originally had been approved by WG2 as "Mende" was changed
to "Mende Kikakui". The UTC had previously approved a name change from "Mende"
to "Kikakui" and requested that that name be used in Amendment 2. The name
change to "Mende Kikakui" was a compromise, responding to a comment from
Ireland and noting the similar precedent for the name of the Bassa Vah script.

Recommendation:

The UTC should approve the block name change for Kikakui to Mende Kikakui
(U+1E800..U+1E8DF), and the corresponding change for the names of all
characters in that block, for Unicode 7.0.

F. Syntax change for Ideograph Description Sequences

There was an extensive discussion in WG2 regarding Japanese NB comments about
ideographic description sequences. The main objection to the use of PUA
characters with ideographic description sequences -- a change to the IDS
syntax that the UTC had already agreed to earlier -- turned out to be relatively
easy to accommodate, once it became clear that specifying a substitution
character for an otherwise unencoded component would satisfy Japan's objection.
As a result, after discussion, everyone concluded that U+FF1F FULLWIDTH
QUESTION MARK was the perfect candidate to add to the syntax to indicate
the presence of an unencoded component. Annex I.1 in 10646 will be adjusted
to add a bullet describing the use of U+FF1F in an IDS to indicate
an 'undescribed component'.

The corresponding change for the Unicode Standard, to keep the syntax of IDS
in synch, would be to add "| U+FF1F" to the IDS syntax in Section 12.2 of
the core specification, plus a little text explaining the use of U+FF1F to
indicate the presence of an undescribed component.
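
To make the effect of the syntax change concrete, here is a minimal sketch of
an IDS well-formedness check that accepts U+FF1F as a component. It is not the
normative grammar. The operator arities reflect my reading of the existing
syntax (U+2FF2 and U+2FF3 take three operands, the other operators in
U+2FF0..U+2FFB take two), and the component ranges in is_component are
simplified, block-level assumptions chosen only for illustration. The names
parse_ids, is_component, and is_valid_ids are hypothetical.

```python
# Sketch only: simplified IDS check with U+FF1F allowed as a component.
BINARY_OPERATORS = {0x2FF0, 0x2FF1} | set(range(0x2FF4, 0x2FFC))
TRINARY_OPERATORS = {0x2FF2, 0x2FF3}
UNDESCRIBED_COMPONENT = 0xFF1F  # FULLWIDTH QUESTION MARK


def is_component(cp):
    """Simplified leaf test (assumption: block-level ranges, not exact)."""
    return (
        0x4E00 <= cp <= 0x9FFF       # CJK Unified Ideographs, main block only
        or 0x2E80 <= cp <= 0x2FDF    # CJK and Kangxi radicals
        or 0x31C0 <= cp <= 0x31EF    # CJK strokes
        or 0xE000 <= cp <= 0xF8FF    # BMP private use
        or cp == UNDESCRIBED_COMPONENT
    )


def parse_ids(text, pos=0):
    """Return the index just past one IDS starting at pos, or raise ValueError."""
    if pos >= len(text):
        raise ValueError("unexpected end of sequence")
    cp = ord(text[pos])
    if cp in BINARY_OPERATORS:
        arity = 2
    elif cp in TRINARY_OPERATORS:
        arity = 3
    elif is_component(cp):
        return pos + 1
    else:
        raise ValueError("U+%04X is not allowed in an IDS" % cp)
    pos += 1
    for _ in range(arity):
        pos = parse_ids(text, pos)  # each operand is itself an IDS
    return pos


def is_valid_ids(text):
    """True if text is exactly one well-formed IDS under the sketch above."""
    try:
        return parse_ids(text) == len(text)
    except ValueError:
        return False
```

Under these assumptions, is_valid_ids("\u2FF0\u6C35\uFF1F") holds: a
left-to-right composition whose second component is undescribed.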

Recommendation:

The UTC should approve the described change to the IDS syntax for Unicode 7.0.

===========================================================================

Changes Related to the CD for the 4th Edition

The CD for the 4th Edition is now being progressed to DIS ballot.

The full listing of the additional 4th Edition repertoire for DIS balloting can
be seen in WG2 N4459 (= L2/13-151).

G. Middle Dot

WG2 reached a compromise on the long-fought-over additional middle dot letter.
The compromise consisted of yet another name change and an agreement to make
the dot large enough not to be easily confused with the existing U+00B7 MIDDLE DOT.
The current character under ballot in the DIS is:

U+A78F LATIN LETTER SINOLOGICAL DOT

Recommendation:

The UTC should discuss the issue and decide what to do.

H. Name change for Sakha Yat

WG2 approved a name change for U+AB60 LATIN SMALL LETTER SAKHA IOTIFIED A to:

U+AB60 LATIN SMALL LETTER SAKHA YAT

Recommendation:

The UTC should approve this name change.

I. Hungarian --> Old Hungarian

WG2 discussed Hungarian yet another time. This one was rather complicated to
work out, because Hungarian was balloted in Amendment 2, and technically
Amendment 2 passed, with Hungarian NB approval. However, there was information
that made it clear that at least some of those in Hungary who approved
the encoding as shown in Amendment 2 were not that happy about the compromise
script name "Hungarian" and actually preferred "Old Hungarian". But there was
no ballot comment from Hungary to that effect. Rather than jeopardize the FDAM vote
on Amendment 2 by changing the script and character names yet again
without opportunity for a technical vote, WG2 decided to push Hungarian one
more time -- this time into the 4th Edition CD, but with the revised script
and character names "Old Hungarian". This gives one more opportunity for
everybody involved to review and assent explicitly to the name.

One of the upshots for the UTC is that this moved (Old) Hungarian out of the
repertoire being prepared for Unicode 7.0.

Recommendation:

The UTC should approve this name change to the block, script, and the corresponding
changes to all the character names. Note that the "Old Hungarian" script name
was the original one that the UTC approved, quite some time ago, and seems to
be the one most acceptable to all parties except for the group arguing for
some version of "ROVASH" for the name.

J. Sharada: Moving some code points

WG2 reviewed comments on the code points for additions of Sharada punctuation, and
agreed to a few code point moves. The current UTC approval is:

U+111CE SHARADA CONTINUATION SIGN
U+111DB SHARADA HEADSTROKE
U+111DC SHARADA SIGN SIDDHAM
U+111DD SHARADA SECTION MARK-1
U+111DE SHARADA SECTION MARK-2

The revised code points in the 4th Edition DIS are:

U+111DB SHARADA SIGN SIDDHAM
U+111DC SHARADA HEADSTROKE
U+111DD SHARADA CONTINUATION SIGN
U+111DE SHARADA SECTION MARK-1
U+111DF SHARADA SECTION MARK-2

Recommendation:

The UTC should approve these code point changes.
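
For anyone updating draft data files, the two listings above can be compared
mechanically. The following is a small sketch, using only the code points and
names given in this section. The dictionary and function names are
hypothetical.

```python
# Current UTC approval, as listed above.
UTC_APPROVED = {
    "SHARADA CONTINUATION SIGN": 0x111CE,
    "SHARADA HEADSTROKE": 0x111DB,
    "SHARADA SIGN SIDDHAM": 0x111DC,
    "SHARADA SECTION MARK-1": 0x111DD,
    "SHARADA SECTION MARK-2": 0x111DE,
}

# Revised code points in the 4th Edition DIS, as listed above.
DIS_REVISED = {
    "SHARADA SIGN SIDDHAM": 0x111DB,
    "SHARADA HEADSTROKE": 0x111DC,
    "SHARADA CONTINUATION SIGN": 0x111DD,
    "SHARADA SECTION MARK-1": 0x111DE,
    "SHARADA SECTION MARK-2": 0x111DF,
}


def moved_code_points(old, new):
    """Map each name whose code point differs to its (old, new) pair."""
    return {
        name: (old[name], new[name])
        for name in old
        if old[name] != new[name]
    }


if __name__ == "__main__":
    for name, (before, after) in sorted(
        moved_code_points(UTC_APPROVED, DIS_REVISED).items()
    ):
        print("U+%04X -> U+%04X %s" % (before, after, name))
```

All five characters move, and the revised code points form the contiguous
range U+111DB..U+111DF.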

K. Siddham section marks

The UTC is on record as having approved 7 Siddham section marks, with names:

U+115CB SIDDHAM SECTION MARK-2
...
U+115D4 SIDDHAM SECTION MARK-11

These were the clearly attested examples out of a longer list in the proposal.
And the naming of the marks just followed the original generic naming in the
proposal.

WG2 reviewed further feedback on the Siddham section marks, and agreed to
approve the full set of them, with revised, descriptive names, instead of
just numbers. The details of the revised proposal can be found in WG2 N4457.
The revised range and a sample of the names are:

U+115CA SIDDHAM SECTION MARK WITH TRIDENT AND U-SHAPED ORNAMENTS
...
U+115D7 SIDDHAM SECTION MARK WITH CIRCLES AND FOUR ENCLOSURES

for a total of 14 section marks. The full list as approved can be seen
in the DIS repertoire document, WG2 N4459 (= L2/13-151).

Recommendation:

The UTC should approve the revised repertoire, code points, and names, to
get back into synch with the DIS repertoire. (Any further suggestions about
names could then be taken up separately, if desired, for ballot comments.)

L. Siddham letter variants

In response to a Japanese request for a number of variant letters and combining
marks for Siddham, WG2 decided to add a total of 6 variant letters:

U+115E0 SIDDHAM LETTER I VARIANT FORM A
U+115E1 SIDDHAM LETTER I VARIANT FORM B
U+115E2 SIDDHAM LETTER II VARIANT FORM A
U+115E3 SIDDHAM LETTER U VARIANT FORM A
U+115E4 SIDDHAM VOWEL SIGN U VARIANT FORM A
U+115E5 SIDDHAM VOWEL SIGN UU VARIANT FORM A

The UTC has seen the original request, in WG2 N4407R (= L2/13-110), but
declined to take action at the last meeting, pending further feedback and
the outcome of the ballot discussion in WG2. Since then there has been some
contrary feedback, as well. See L2/13-126 and commentary on the unicore list.

Recommendation:

The UTC should discuss and decide what to do.

M. Early Dynastic Cuneiform code point changes

WG2 decided to remove one duplicate Cuneiform code point. This removal was
already approved by the UTC. As anticipated, WG2 then decided to remove
the "hole" in the block and closed up the gap by moving code points. The
revised range is: U+12480..U+12543.

Recommendation:

The UTC should approve the move of the subrange U+124D3..U+12544 to
U+124D2..U+12543.
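
The move amounts to shifting every code point in the affected subrange down
by one. A minimal sketch of that remapping, for use against draft data, is
given here. The function name remap_early_dynastic is hypothetical, and the
caller is assumed to pass only code points that were assigned in the old
layout.

```python
OLD_FIRST = 0x124D3  # first code point of the subrange being moved
OLD_LAST = 0x12544   # last code point of the subrange being moved


def remap_early_dynastic(cp):
    """Return the revised code point for a code point in the old layout.

    Code points in U+124D3..U+12544 shift down by one to close the gap;
    anything outside that subrange is returned unchanged.
    """
    if OLD_FIRST <= cp <= OLD_LAST:
        return cp - 1
    return cp
```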

N. Hatran

WG2 decided to delete HATRAN LETTER RESH and three number signs from the
Hatran block. These changes were responsive to issues raised by the UTC
on the CD. HATRAN LETTER DALETH was renamed HATRAN LETTER DALETH-RESH,
and the gap in the range of letters and numbers was removed by changing several code
points. The revised encoding for the block can be seen in L2/13-151.

Recommendation:

The UTC should approve the Hatran block (U+108E0..U+108FF), with 26 characters,
code points, names, and glyphs as shown in L2/13-151.

O. Anatolian Hieroglyphs

WG2 also made a number of changes to the content of the Anatolian Hieroglyphs
block under ballot in the CD, including removal of all reference to
decompositions for the signs. These changes were also responsive to comments
which had been made by the UTC on earlier documents.

Recommendation:

The UTC should approve the Anatolian Hieroglyphs block (U+14400..U+1467F),
with 583 characters in the range U+14400..U+14646, with code points, names,
and glyphs as shown in L2/13-151.

P. CJK Extension E

WG2 made a number of revisions to the repertoire under ballot in the CD
for CJK Extension E, including the removal of 6 characters, moving code points
to get rid of the gaps in the range, and some fixes for associated mapping data.
At this point, the repertoire seems mature enough for UTC approval.

Recommendation:

The UTC should approve the CJK Extension E block (U+2B820..U+2CEAF),
with 5762 characters in the range U+2B820..U+2CEA1, with code points and
glyphs as shown in L2/13-151 (see p. 60ff of the pdf).
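
As a cross-check on the character counts quoted in this docket, the counts
can be recomputed from the stated ranges. This sketch assumes each range is
fully populated, with no unassigned code points inside it. The table and
function names are hypothetical.

```python
def range_size(first, last):
    """Number of code points in the inclusive range first..last."""
    return last - first + 1


# (description, first, last, count stated in this docket)
STATED_COUNTS = [
    ("Pahawh Hmong clan logographs", 0x16B7D, 0x16B8F, 19),
    ("Siddham section marks", 0x115CA, 0x115D7, 14),
    ("Anatolian Hieroglyphs", 0x14400, 0x14646, 583),
    ("CJK Extension E", 0x2B820, 0x2CEA1, 5762),
]

# Any entry whose stated count disagrees with its range ends up here.
mismatches = [
    (name, stated, range_size(first, last))
    for name, first, last, stated in STATED_COUNTS
    if range_size(first, last) != stated
]

if __name__ == "__main__":
    for name, first, last, stated in STATED_COUNTS:
        print("%-30s U+%04X..U+%04X  %d" % (name, first, last, range_size(first, last)))
```

Under that assumption, all four stated counts agree with their ranges.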