L2/24-225
Source: Editorial Working Group
Date: October 24, 2024
FYI: The Unicode 16.0 core specification was published as scheduled on September 10, 2024, along with Version 16.0 of the Unicode Standard, including all its associated annexes and correlated Unicode Technical Standards.
FYI: The Editorial Committee has started review of new content planned for the eventual 17.0 publication of the core specification. Work is also ongoing on routine upkeep of the core specification, and on keeping current with bug reports and other small tweaks to core specification content mandated by the UTC.
The Editorial Working Group continues its periodic review and general maintenance of Unicode web pages, both on its own initiative and in response to public feedback.
We are currently looking into a potential redesign of the TUS landing page, in particular to improve access to the core specification from the page (per feedback ID20241006161903), with a possible backport of these improvements to the 16.0 TUS landing page as well. The Editorial Working Group plans to coordinate with the Release Management Group on this matter.
We have updated the private unicode-org FAQ repository so that the FAQ source is maintained directly in the repository and deployed from it.
The FAQ pages are automatically glossarized during deployment.
FYI: The Editorial Working Group continues to meet regularly. Our meetings are generally held on a biweekly schedule, except when they coincide with holidays or other events, such as UTC meetings. This report to the UTC includes feedback from the Editorial Working Group meetings held on August 1, 2024, August 15, 2024, August 29, 2024, September 12, 2024, September 26, 2024, October 10, 2024, and October 24, 2024.
FYI: Public-facing information about the Editorial Working Group and its work is maintained on the Unicode Editorial Working Group Page on the website. The Editorial Working Group also maintains an internal subsite for use by the committee. People who would like to find out more about the work of the Editorial Working Group or contribute to that work should contact the Chair, Louka Ménard Blondin (louka@unicode.org).
In the past, we have maintained a separate series of TUS Futures meetings in order to discuss improvements to the presentation of our web pages and some of our more public-facing documents. With the core of this work done and fully rolled into the distribution of the Unicode 16.0 core specification, we have discontinued TUS Futures as separate meetings.
The Editorial Working Group is in ongoing need of volunteer editors with copyediting experience. People who are interested in learning more about this work and potentially taking it up should contact the Chair for more information.
Work is ongoing on improving the public documentation about the Editorial Working Group for potentially interested contributors both inside and outside of Unicode. We eventually plan to document and chart the internal processes of the committee to help newcomers better understand our work.
FYI: During this cycle, the Editorial Working Group has been lightly reviewing UAXes and UTSes.
Date/Time: Fri Jul 26 02:41:07 CDT 2024 ReportID: ID20240726024107 Name: Werner Lemberg Report Type: Error Report Opt Subject: NamesList.txt
As discussed in the thread starting at https://corp.unicode.org/pipermail/unicode/2024-July/010976.html it turned out that the two characters
1D132 MUSICAL SYMBOL QUARTER TONE SHARP
1D133 MUSICAL SYMBOL QUARTER TONE FLAT
are not accidentals but *pitch modifiers*, to be added to left of an accidental (or a note without an accidental) and indicating that the pitch of the given note has to be raised or lowered by a quarter tone, respectively. The provided scans in the discussion confirm this usage.
In other words, these two characters should be put into a separate section `@ Pitch modifiers` or something like that.
Action Item for Ken Whistler, EDC: Update the NamesList.txt with pitch modifiers subhead at 1D132 for Unicode Version 17.0. [Reference: Section E1 of L2/24-225]
Date/Time: Thu Aug 08 09:06:48 CDT 2024 ReportID: ID20240808090648 Name: Lucas Report Type: Error Report Opt Subject: Multiple
The Latin Letters D, K, L, N and R as used in Livonian, Old-Prussian, Latvian and Romanian (all around the Baltic area) are supposed to have a comma underneath, and not a cedilla. I have not found a single source that needs these letters with an actual cedilla, other than errors caused by you, Unicode. According to Wikipedia these letters were mistakenly encoded with a Cedilla by Unicode in the early nineties, and that Unicode claims these errors can not be fixed, (even though, in general, the computer world is all about bugfixing). These letters should not combine with 0327, but with 0326, as you probably know, since the font used in your charts shows a proper comma-accent. The Calibri font fonts I designed also use comma accents.
Your Unicode-bugs are the cause of many fonts actually using cedillas instead of comma accents. Your bug has also caused the recent DIN 91379 Norm to include sequences for these letters combined with 0326 comma accent, instead of using the existing Unicodes of the precomposed letters.
If you, for whatever reason, refuse to fix the bugs introduced by your predecessors, than at least add notes to ALL of these 10 codepoints, in your charts, that this was a historic mistake, and that the accents should actually look like free floating comma accents (0326) and not cedillas (0327).
1E10 Ḑ LATIN CAPITAL LETTER D WITH CEDILLA (0044 + 0327)
1E11 ḑ LATIN SMALL LETTER D WITH CEDILLA (0064 + 0327)
0136 Ķ LATIN CAPITAL LETTER K WITH CEDILLA (004B + 0327)
0137 ķ LATIN SMALL LETTER K WITH CEDILLA (006B + 0327)
013B Ļ LATIN CAPITAL LETTER L WITH CEDILLA (004C + 0327)
013C ļ LATIN SMALL LETTER L WITH CEDILLA (006C + 0327)
0145 Ņ LATIN CAPITAL LETTER N WITH CEDILLA (004E + 0327)
0146 ņ LATIN SMALL LETTER N WITH CEDILLA (006E + 0327)
0156 Ŗ LATIN CAPITAL LETTER R WITH CEDILLA (0052 + 0327)
0157 ŗ LATIN SMALL LETTER R WITH CEDILLA (0072 + 0327)
ASAP please, thank you.
Action item for Ken Whistler, EDC: Consider adding annotations to NamesList.txt for the case pairs 1E10/1E11, 0136/0137, 013B/013C, 0145/0146, 0156/0157, for example:
Despite the name, this pair of characters should normally be displayed with a comma below
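For reference, the ten characters named in this report all have canonical decompositions to a base letter followed by U+0327 COMBINING CEDILLA, which is what the report objects to. The following is a minimal sketch that prints those decompositions using Python's standard unicodedata module (its data reflects whichever Unicode version the interpreter ships with). It is illustrative only and is not part of the Editorial Working Group's tooling; the variable names are hypothetical.

```python
import unicodedata

# Code points listed in report ID20240808090648.
code_points = [
    0x1E10, 0x1E11, 0x0136, 0x0137, 0x013B,
    0x013C, 0x0145, 0x0146, 0x0156, 0x0157,
]

# Map each code point to its canonical decomposition mapping,
# given by unicodedata as space-separated hexadecimal code points.
decompositions = {
    cp: unicodedata.decomposition(chr(cp)) for cp in code_points
}

for cp, mapping in decompositions.items():
    print(f"{cp:04X} {unicodedata.name(chr(cp))} -> {mapping}")
```

Each mapping ends in 0327 rather than 0326, so an annotation along the lines proposed above would describe the expected rendering without touching the decompositions themselves.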
Date/Time: Sat Aug 31 22:24:11 CDT 2024 ReportID: ID20240831222411 Name: Guillaume Fortin-Debigaré Report Type: Error Report Opt Subject: Unicode 15.1.0 Core Specifications - Chapter 22 Symbols
Table 22-5 "Mathematical Operators Disunified from Punctuation" lists the incorrect Unicode code point for the SOLIDUS character in the second row of the left column. If should be 002F instead of 003F.
This error has been fixed in the Unicode 16.0 core spec.
Date/Time: Sat Sep 07 05:14:42 CDT 2024 ReportID: ID20240907051442 Name: Ivan Panchenko Report Type: Error Report Opt Subject: U0000.pdf
A minor slip: In U0000.pdf, the following is shown with two right single quotation marks (they are not ASCII apostrophes!) instead of a left and a right one:
for ’Greek question mark’
This, as well as other examples, has been fixed, and will become visible when the first 17.0 version of NamesList.txt starts to surface.
Date/Time: Wed Sep 11 04:07:14 CDT 2024 ReportID: ID20240911040714 Name: Ivan Panchenko Report Type: Error Report Opt Subject: U2100.pdf
There are two issues with the informative aliases “first transfinite cardinal (countable)”, “second transfinite cardinal (the continuum)”, “third transfinite cardinal (functions of a real variable)” and “fourth transfinite cardinal” for the characters U+2135 (ALEF SYMBOL), U+2136 (BET SYMBOL), U+2137 (GIMEL SYMBOL) and U+2138 (DALET SYMBOL), respectively.
1) Aleph is used together (!) with 0, 1, … as an index to indicate cardinalities of well-ordered infinite sets (in ascending order). (Without an index, it is apparently sometimes used for the cardinality of the continuum, not the first transfinite cardinal!) Beth and gimel are also used with an index (you can look up the definition), while daleth does not have an established meaning and was apparently just included in LaTeX so that it can be used in an ad-hoc manner. (Even if there is someone out there who uses the characters as the aliases indicate, that would be an idiosyncrasy that does not deserve mention in the only alias.)
2) That the cardinality of the continuum is the second transfinite cardinal amounts to the continuum hypothesis, which is known to be independent of the set theory ZFC, and among those set theorists who have a belief either way, it seems like most believe it to be false.
Action item for Ken Whistler, EDC: Drop the aliases from the characters U+2135 to U+2138, changing “These are left-to-right characters.” to “These are left-to-right characters. They are used in notations of transfinite cardinals.” [Reference: Section E1 of L2/24-225]
Date/Time: Wed Sep 11 05:14:22 CDT 2024 ReportID: ID20240911051422 Name: Ivan Panchenko Report Type: Error Report Opt Subject: (none)
Two further remarks:
1) The reference glyph for U+3388 and that for U+3389 have an italicized “cal” for the calorie. This unit symbol should not be italicized. While the glyphs are not normative, it would be great if this could be corrected; an italic mu (in glyphs of the chart) has already been corrected to an upright one.
2) The character U+2263 (≣ STRICTLY EQUIVALENT TO) is found under the subhead “Relations”. I think it would be more appropriate to put it under “Logical operator” (for comparison: U+2227) because it stands for a connective in modal logic. See here: https://corp.unicode.org/pipermail/unicode/2022-July/010231.html
Regarding the first remark: this should be forwarded to the CJK group.
Regarding the second remark: we recommend no action.
https://corp.unicode.org/pipermail/unicode/2022-July/010231.html focuses on the semantics of one particular usage in symbolic logic. Perhaps the submitter’s focus on that is inspired by the logic-sounding character names, but character names for mathematical symbols cannot reflect the breadth of their use, which is « whatever mathematicians feel like ». Cursory searches find https://math.stackexchange.com/a/2788325 or https://www.reddit.com/r/askmath/comments/68i9on/is_there_a_conventional_use_for_the_strictly/ which mention uses interchangeable with other =-family characters. More importantly, in the context of mathematical typesetting, relations and (binary) operators are also typographical categories, affecting e.g., spacing. MathClassEx, revision 15, classifies ≣ as R, like ≡ et al.; and it would be very weird to see ≡ typeset differently from ≣.
Date/Time: Tue Sep 24 06:11:31 CDT 2024 ReportID: ID20240924061131 Name: Ben Harris Report Type: Error Report Opt Subject: The Unicode® Standard Version 16.0 – Core Specification
A piece of text has been lost in the translation to HTML for Unicode 16. In Unicode 15.1.0, this text appears:
"So for example, the representation of the number 12,346 in the traditional system would be by a sequence of CJK ideographs with numeric values as follows: <one, ten-thousand, two, thousand, three, hundred, four, ten, six>."
That is, the example is "one, ten-thousand, two, thousand, three, hundred, four, ten, six", surrounded by less-than and greater-than signs.
In Unicode 16.0.0, at https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-22/#G46185, the same sentence reads:
"So for example, the representation of the number 12,346 in the traditional system would be by a sequence of CJK ideographs with numeric values as follows: ."
That is, the entire text within and including the less-than and greater-than signs has vanished. The HTML source shows that the text does actually appear in the source, but the less-than sign has not been properly escaped and so is interpreted as markup by browsers.
This makes me suspect that there may be other similar problems elsewhere in the standard. I haven't (yet) made any attempt at looking for them.
This has been fixed in the 17.0 core specification draft.
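The class of problem described in this report can be illustrated with a minimal sketch using Python's standard html module. This is only an assumption-level illustration of escaping angle brackets before text is placed in HTML; it does not describe the conversion tooling actually used for the core specification.

```python
import html

# The sentence from the report, with the angle-bracketed sequence intact.
sentence = (
    "a sequence of CJK ideographs with numeric values as follows: "
    "<one, ten-thousand, two, thousand, three, hundred, four, ten, six>."
)

# html.escape replaces &, < and > (and, by default, quotation marks)
# with character references, so a browser renders them as literal text
# instead of treating "<one, ..." as the start of a tag.
escaped = html.escape(sentence)
print(escaped)

# Unescaping restores the original sentence unchanged.
restored = html.unescape(escaped)
```

Without such escaping, a browser treats everything from the less-than sign to the greater-than sign as an unknown tag and renders nothing, which matches the symptom reported.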
Date/Time: Fri Oct 04 11:39:10 CDT 2024 ReportID: ID20241004113910 Name: Malo Report Type: Error Report Opt Subject: The Unicode® Standard Version 16.0 Core Specification
Section 24.1.9 of the Unicode® Standard Version 16.0 Core Specification (https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-24/#G3725) includes sample character list which contains a mistake: 212B Å ANGSTROM SIGN is incorrectly marked as having the canonical mapping 00C5 Å angstrom sign, instead of 00C5 Å latin capital letter a with ring above. Note that this error is not present in the corresponding chart (https://www.unicode.org/charts/PDF/U2100.pdf).
This has been fixed in the 17.0 core specification draft.
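The mapping in question can be checked against the Unicode Character Database. The following is a minimal sketch using Python's standard unicodedata module (its data reflects the Unicode version bundled with the interpreter); the variable names are hypothetical and the snippet is illustrative only.

```python
import unicodedata

angstrom = "\u212B"

# The canonical decomposition mapping is a single code point, in hex.
mapping = unicodedata.decomposition(angstrom)
target = chr(int(mapping, 16))

source_name = unicodedata.name(angstrom)
target_name = unicodedata.name(target)
print(f"212B {source_name} -> {mapping} {target_name}")

# Normalizing to NFC replaces the singleton with its canonical equivalent.
nfc = unicodedata.normalize("NFC", angstrom)
```

The name attached to 00C5 is LATIN CAPITAL LETTER A WITH RING ABOVE, consistent with the code chart and with the correction requested in the report.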
Date/Time: Sun Oct 06 16:19:03 CDT 2024 ReportID: ID20241006161903 Name: Jim DeLaHunt Report Type: Error Report Opt Subject: www.unicode.org/versions/latest/
Passing on a social media comment about page at https://www.unicode.org/versions/latest/ . Reader visits the page wanting to find the Core Spec (can generalise other parts of the Unicode Standard such as UTRs). Reader expects that the page will contain links to the parts of the core spec which they seek. Instead, the page describes the differences between the latest version of TUS and the previous version. I suggest adding a section to the top of this page, describing "The current version of The Unicode Standard is 16.0.0. It consists of a Core Specification (link), some Code Charts (link), etc. Then put the current content under a heading like "Differences from previous version of the Standard".
The present set of links, especially the unnumbered list of links under "B. Technical Overview", might make the reader hope they link to the parts of the Standard, but in fact they link to subheadings below which describe changes. It would be better for the list of links at the top of the page be to the parts of the latest version of The Unicode Standard, as implied by the URL.
Original social media post: https://cosocial.ca/@timbray/113170595870924709 , by Tim Bray of XML fame. Relayed by Jim DeLaHunt. The explanation above is mine, not Tim's. He may submit his own Error Report in his own words.
Note that the tech site home page has already been improved to include a direct link to the latest core spec.
The Editorial Working Group will try to improve the 17.0 landing page, and potentially retrofit the 16.0 one.
Date/Time: Thu Oct 24 10:04:37 CDT 2024 ReportID: ID20241024100437 Name: Sridatta A Report Type: Error Report Opt Subject: Corrections to Unicode chapter of Tulu-Tigalari
In chapter 15 https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-15/#G71814
“Tulu-Tigalari is a historic script attested in a large number of manuscripts from Karnataka and northern Kerala dating to as early as 1300 CE. It was used to write Sanskrit, Tulu, and Malayalam, “
Should be corrected to have Kannada instead of Malayalam.
In #Figure 15-5. The glyph is that of ju than chu
This has been fixed in the 17.0 core specification draft.
Nothing to report.