Editorial Committee Report

L2/20-241

Editorial Committee Report and Recommendations for UTC #165 Meeting

Source: Editorial Committee

Date: October 4, 2020

A. Unicode Release Topics

A1. Unicode 14.0 Schedule and Planning

FYI: The significant milestones for the Unicode 14.0 release are:

Beta start: June 4, 2021

Beta close: July 20, 2021

Release: September 14, 2021

These dates are unchanged from those reported in the Editorial Committee Report and Recommendations for UTC #164 Meeting.

The Editorial Committee is also recommending that the UTC aim to establish the planned repertoire for Unicode 14.0 by its January, 2021 meeting. That will give a sufficient window for public review and feedback before code points and character names are all locked down for the start of beta review in June, 2021. See topic A2 below.

A2. "Alpha" Pipeline Charts and Early Review for 14.0

FYI: We have made no progress on getting Alpha charts posted. This work will presumably be done after any determinations of new character additions at the October UTC meeting.

A3. French Charts Announcement

FYI: Michel has produced a set of French code charts for Unicode 13.0. They are based on a complete French translation of the Unicode 13.0 NamesList.txt, the result of very extensive translation work by Patrick Andries and others over the years. These charts consist of a complete, single archival Unicode 13.0 code chart in French, and a set of block-by-block charts. (In our Editorial Committee parlance, the latter are known as the "unversioned" charts, as they are not archived for a particular version, but instead reflect the kind of block-by-block charts we post for the latest version.)

The availability of these Version 13.0 French code charts was announced on August 20. The charts themselves are now live on the site: French code charts. There is also some contextualizing information about the French code charts added to our existing help page for the code charts, to explain the informative status of French translations of Unicode character names.

It is currently unclear whether this set of French code charts will be a one-off for Unicode 13.0, or whether a regular update can eventually be integrated into the release process for future versions of the Unicode Standard.

B. Website Topics

B1. Website Status

FYI: The technical website is stable and completely accessible now. The longstanding issue of the Unicode tools JSPs not being functional has been addressed. The JSPs (which underlie the pages that provide Unicode set and property information, bidi and normalization demos, etc.) have been rebuilt and deployed. See for example:

Unicode Utilities: BIDI (UBA)

A robust backup scheme is now in place for both the public (technical) website and the internal corporate website, which is where the Editorial Committee work is done. We should be less at risk now of the kind of incident which happened during the catastrophic VM crash in April, 2020.

B2. Website Content Maintenance

FYI: The Editorial Committee is working on a complete analysis of the technical website content, so that content ownership can be rationalized and a more systematic approach to ongoing maintenance can eventually be developed.

C. Process Issues

Nothing new to report at this time.

D. UTR Topics

D1. UTS #51, Unicode Emoji

FYI: Version 13.1 of UTS #51, Unicode Emoji, was published on September 18 as part of the release of Emoji 13.1. Emoji 13.1 added 217 new emoji sequences, but no new emoji characters. There were only minor updates made to the text of UTS #51.

E. PRI Topics

E1. Editorial Feedback on open PRIs for documents

FYI: The Editorial Committee has no new feedback on open PRIs at this time.

F. Responses to Public Feedback

FYI: The Editorial Committee has reviewed the general public feedback routed for its consideration in the UTC #165 Comments on Public Review Issues document: L2/20-239. The exact text of all that editorial feedback can be referred to in L2/20-239. The short summaries below simply reference the authors and dates of the feedback, giving any relevant conclusions from the discussion. The suggested action items are queued up below the discussion section.

Discussion:

David Corbett (July 21): This observation makes sense. Refer to sources such as medieval punctuation for clarification. The intent of punctus elevatus appears to be to indicate an intermediate pause where the sensus is complete, but the sentence is not. (That is similar in some respects to the way semicolon is used as modern punctuation.)

Ajith (July 29): This reports a typo in the name of U+0BA9, cited in the Malayalam section (12.9) of the core specification. This typo is noted and will simply be corrected by the editor. There is no need to record a separate action item for that change.

Ajith (July 29) Candrakkala Examples: This feedback reports a valid concern. The Malayalam candrakkala examples cited in the TUS 13.0 text are not regular forms, but they are attested. We recommend that explanatory text be added to provide more information about the context of this usage.

Ajith (July 30): This feedback claims that the representation of two-part vowels in three Malayalam examples in the TUS 13.0 text should use the precomposed form, instead of the decomposed form of those vowels. Our assessment is that there is a valid rationale for the sequences used in those examples, so we suggest the text of the core specification be augmented to provide that rationale.

Peter Constable (July 30): These two sets of feedback from Peter Constable make specific text suggestions for clarification in the introduction and in Section 3.1 of UTS #39. The Editorial Committee agrees that the text additions and clarifications would be helpful and suggests that Peter add his drafts to a proposed update for UTS #39 for Version 14.0. Note that there are other actions related to preparation of a proposed update of UTS #39 for 14.0 (see 164-A47, 162-A57), so this textual work will need to be coordinated with Mark Davis.

Peter Constable (July 30): These two sets of feedback from Peter Constable suggest very specific small text changes to UAX #31 in two locations in Section 1 of the specification. The Editorial Committee agrees that these would be helpful updates, and that they should be made in the next revision of the text, in a proposed update for Version 14.0. The text changes are small and explicit, so they could be undertaken by Ken Whistler simply as part of the normal "boilerplating" of the UAX for a proposed update.

David Starner (Aug 24): This feedback notes that Table 22-4, Compatibility Digits, omitted a row for the Segmented Digits (U+1FBF0..U+1FBF9). This observation is correct. The feedback has been passed on to the editor, who has already corrected the table in the draft of the core specification for Version 14.0. There is no need to record a separate action item for that change.

Roozbeh Pournader (Aug 24): This feedback notes a typo in the latest approved version of UAX #14. This typo has already been corrected in the proposed update of UAX #14 for Version 14.0, so no action item needs to be recorded.

Norbert Lindenberg (Aug 29): This feedback suggests gathering together various information about Devanagari clusters in Section 12.1, Devanagari, of the core specification, in order to provide regular expressions covering all the possible cluster patterns. The Editorial Committee thinks that doing so might well be helpful, but that providing a full framework for Devanagari syllabic structure involves technical issues and cannot simply be handled with editorial content changes. We suggest that Norbert Lindenberg write up his analysis into a formal proposal and then submit it to the Script Ad Hoc for further discussion and technical review by Indic script experts.
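
Illustration for the Ajith (July 30) item above: the precomposed and decomposed spellings of the Malayalam two-part vowel signs are canonically equivalent, so the choice between them in the examples is one of presentation and rationale, not of identity. The following is only a minimal sketch using Python's standard unicodedata module. The helper name canonically_equivalent and the choice of the consonant KA (U+0D15) are assumptions made for illustration; the sketch does not reproduce the actual examples in the core specification.

```python
import unicodedata

# Malayalam two-part vowel signs, paired with their canonical decompositions.
TWO_PART_VOWELS = {
    "\u0D4A": "\u0D46\u0D3E",  # VOWEL SIGN O  = VOWEL SIGN E  + VOWEL SIGN AA
    "\u0D4B": "\u0D47\u0D3E",  # VOWEL SIGN OO = VOWEL SIGN EE + VOWEL SIGN AA
    "\u0D4C": "\u0D46\u0D57",  # VOWEL SIGN AU = VOWEL SIGN E  + AU LENGTH MARK
}


def canonically_equivalent(a: str, b: str) -> bool:
    """Return True if two strings have the same NFD normalization.

    Hypothetical helper name, introduced only for this sketch.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


KA = "\u0D15"  # MALAYALAM LETTER KA, chosen arbitrarily as a base consonant

for precomposed, decomposed in TWO_PART_VOWELS.items():
    same = canonically_equivalent(KA + precomposed, KA + decomposed)
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in decomposed)
    print(f"U+{ord(precomposed):04X} vs <{code_points}>: equivalent = {same}")
```

Because the two spellings normalize to the same string, a process that compares normalized text treats them alike. That is why the explanation requested for the core specification concerns the rationale for the sequences shown, rather than any difference in what they represent.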

Suggested action items:

EC-UTC165-R1: The Editorial Committee recommends recording the following action items.

AI. Ken Whistler, Ed Committee. Clarify the names list annotation regarding punctus elevatus (U+2E4E) for Unicode 14.0. Ref. David Corbett, July 21, in L2/20-239. [Tue Jul 21 13:06:27 CDT 2020]

AI. Liang Hai, Ed Committee. In Section 12.9, Malayalam, of the core specification, provide clarification about the attestations of candrakkala (U+0D4D) in some irregular forms. For Unicode 14.0. Ref. Ajith, July 29, in L2/20-239. [Wed Jul 29 23:33:52 CDT 2020]

AI. Liang Hai, Ed Committee. In Section 12.9, Malayalam, of the core specification, provide an explanation of the rationale for use of decomposed sequences in two-part vowels in the examples in Table 12-41. For Unicode 14.0. Ref. Ajith, July 30, in L2/20-239. [Thu Jul 30 01:19:40 CDT 2020]

AI. Peter Constable, Ed Committee. Prepare proposed update text for UTS #39 for Version 14.0, incorporating textual suggestions noted in L2/20-239. Ref. Peter Constable, July 30. [Thu Jul 30 15:56:14 CDT 2020, Thu Jul 30 16:27:43 CDT 2020]

AI. Ken Whistler, Ed Committee. Prepare proposed update text for UAX #31 for Version 14.0, incorporating specific textual suggestions noted in L2/20-239. Ref. Peter Constable, July 30. [Thu Jul 30 16:47:52 CDT 2020, Thu Jul 30 17:11:37 CDT 2020]

AI. Rick McGowan. Post a PRI for the proposed update of UAX #31 for Unicode 14.0, with a close date of 2020-12-NN.

G. Miscellaneous Topics

G1. (None noted)