Comments on Public Review Issues

L2/25-163

Comments on Public Review Issues
(April 3, 2025 - July 2, 2025)

The sections below contain links to permanent feedback documents for the open Public Review Issues as well as other public feedback as of April 3, 2025 - July 2, 2025, since the previous cumulative document was issued prior to UTC #184 (July 2, 2025).

Issue Name Feedback Link

508 Proposed Update UAX #38, Unicode Han Database (Unihan) (feedback)

509 Proposed Draft UTS #58, Unicode Linkification (feedback)

510 Proposed Draft UTR #59, East Asian Spacing (feedback)

511 Proposed Update UTS #10, Unicode Collation Algorithm (feedback)

513 Proposed Update UAX #50, Unicode Vertical Text Layout (feedback)

516 Proposed Update UAX #44, Unicode Character Database (feedback)

518 Proposed Update UTS #51, Unicode Emoji (feedback)

519 Proposed Update UAX #57, Unicode Egyptian Hieroglyph Database (Unikemet) (feedback)

520 Proposed Draft UAX #60, Data for Non Han Ideographic Scripts (feedback)

521 Combined Registration of the CAAPH collection and of sequences in that collection (feedback)

522 Proposed Update UAX #45, U-source Ideographs (feedback)

523 Proposed Draft UTS #61, Unicode Set Notation (feedback)

524 Proposed Update UAX #41, Common References for Unicode Standard Annexes (feedback)

525 Proposed Update UAX #14, Unicode Line Breaking Algorithm (feedback)

526 Unicode 17.0.0 Beta (feedback)

527 Proposed Update UAX #31, Unicode Identifiers and Syntax (feedback)

528 Proposed Update UAX #42, Unicode Character Database in XML (feedback)

529 Proposed Update UTS #39, Unicode Security Mechanisms (feedback)

530 Proposed Update UTS #46, Unicode IDNA Compatibility Processing (feedback)

531 Proposed Update UAX #29, Unicode Text Segmentation (feedback)

Issue	Name	Feedback Link
508	Proposed Update UAX #38, Unicode Han Database (Unihan)	(feedback)
509	Proposed Draft UTS #58, Unicode Linkification	(feedback)
510	Proposed Draft UTR #59, East Asian Spacing	(feedback)
511	Proposed Update UTS #10, Unicode Collation Algorithm	(feedback)
513	Proposed Update UAX #50, Unicode Vertical Text Layout	(feedback)
516	Proposed Update UAX #44, Unicode Character Database	(feedback)
518	Proposed Update UTS #51, Unicode Emoji	(feedback)
519	Proposed Update UAX #57, Unicode Egyptian Hieroglyph Database (Unikemet)	(feedback)
520	Proposed Draft UAX #60, Data for Non Han Ideographic Scripts	(feedback)
521	Combined Registration of the CAAPH collection and of sequences in that collection	(feedback)
522	Proposed Update UAX #45, U-source Ideographs	(feedback)
523	Proposed Draft UTS #61, Unicode Set Notation	(feedback)
524	Proposed Update UAX #41, Common References for Unicode Standard Annexes	(feedback)
525	Proposed Update UAX #14, Unicode Line Breaking Algorithm	(feedback)
526	Unicode 17.0.0 Beta	(feedback)
527	Proposed Update UAX #31, Unicode Identifiers and Syntax	(feedback)
528	Proposed Update UAX #42, Unicode Character Database in XML	(feedback)
529	Proposed Update UTS #39, Unicode Security Mechanisms	(feedback)
530	Proposed Update UTS #46, Unicode IDNA Compatibility Processing	(feedback)
531	Proposed Update UAX #29, Unicode Text Segmentation	(feedback)

The links below go to locations in this document for feedback.

Feedback routed to CJK & Unihan Working Group for evaluation [CJK]
Feedback routed to Script Encoding Working Group for evaluation [SEW]
Feedback routed to Properties & Algorithms Working Group for evaluation [PAG]
Feedback routed to Emoji Standard & Research Working Group for evaluation [ESC]
Feedback routed to Editorial Working Group for evaluation [EDC]
Other Reports

Feedback routed to CJK & Unihan Working Group for evaluation [CJK]

(None at this time.)

Feedback routed to Script Encoding Working Group for evaluation [SEW]

Date/Time: Tues April 29 12:35:45 PDT 2025
ReportID: ID20250429123545
Name: Rebecca Bettencourt
Report Type: Feedback
Opt Subject: MODIFIER LETTER HIGH AND LOW VERTICAL LINE

 The proposed character MODIFIER LETTER HIGH AND LOW VERTICAL LINE does not look like it belongs in the Superscripts and Subscripts 
 block where it is currently proposed.

Although it is a modifier letter, it is not a superscript or subscript.

It would fit better in Latin Extended-D or -E, which are also in the BMP and already contain modifier letters.

Date/Time: Fri May 02 12:45:34 PDT 2025
ReportID: ID20250502124534
Name: Pepsi Jerking
Report Type: Feedback
Opt Subject: Рассмотрение изменения и добавления символов

Здравствуйте. Прошу вас рассмотреть возможность изменить начертание знаков глаголицы (U+2C00-U+2C5F) на начертание, которое 
употреблено в Киевских листках. Если это не представляется возможным, то прошу отвести для символов глаголицы с начертанием 
по Киевским листкам. Также прошу постепенно добавлять символы кохау ронго-ронго в отведённый диапазон (1CA80–1CDBF). Спасибо 
и успехов в работе.

English translation:
Hello. I ask you to consider the possibility of changing the style of Glagolitic characters (U+2C00-U+2C5F) to the style used 
in the Kyiv leaflets. If this is not possible, then I ask you to allocate the Glagolitic alphabet for the symbols with the style 
according to the Kyiv leaves. I also ask you to gradually add kohau rongo-rongo characters to the allotted range (1CA80–1CDBF). 
Thank you and good luck in your work.

Date/Time: Sun May 18 01:24:55 PDT 2025
ReportID: ID20250518012455
Name: Evie Undis
Report Type: Feedback
Opt Subject: Warning on a suggested name in L2/25-091


Apologies if this is the wrong place to submit this. (If it is, could I please be recommended the appropriate contact avenue?)

In L2/25-091 (SEW recommendations to the April 2025 UTC meeting) section 5.12, it is suggested that the Leibniz radix signs be 
renamed FOURTH ROOT etc. But FOURTH ROOT is already the name of U+221C. I do not have a suggestion for a better name for the proposed characters.

Date/Time: Sat May 31 08:28:22 PDT 2025
ReportID: ID20250531082822
Name: Julian Tseu
Report Type: Report Error in Publication/Data
Opt Subject: Incorrect Character Properties for Phags-pa Lette




To the Unicode Technical Committee,

I am writing to report a critical discrepancy between Unicode’s current character properties for two Phags-pa (ʼPhags-pa) script characters 
and their historically attested usage in primary sources. Specifically:

ꡎ (U+A84E) is currently assigned the sound value /b/ (voiced bilabial stop) in Unicode documentation.
ꡍ (U+A84D) is assigned /pʰ/ (aspirated voiceless bilabial stop).
However, extensive philological evidence from Yuan Dynasty (1271–1368) sources—including the seminal rhyme dictionary 《蒙古字韵》 
(Měnggǔ Zìyùn, c. 1269–1308)—demonstrates that these assignments are reversed:

《蒙古字韵》 explicitly categorizes:
ꡎ (U+A84E) as "滂" (pāng), representing aspirated /pʰ/ (e.g., in Chinese words like "坡" pō).
ꡍ (U+A84D) as "並" (bìng), representing voiced /b/ (e.g., in words like "波" bō).

Sincerely,
Julian Tseu

Date/Time: Fri June 27 18:05:11 PDT 2025
ReportID: ID20250627180511
Name: Michael Macha
Report Type: General Feedback
Opt Subject: Encouragement of inclusion of breath marks

I am a comic book letterer. It's true that most of my work is concluded by hand, using a stylus; this allows for certain graphic-novel appropriate 
levels of emphasis which would be out of place in a less visual medium; like controlled weighting. However, I heavily use modern technology to aid 
the speed of my work, and often box lettering out using a sans, often monospace, unicode font. This allows for rapid alignment and composition of 
the text, saving time, effort, and money.

One thing which regularly comes up in my work is the breath mark. ⚞ and ⚟, code point U+269E and U+269F, are the closest thing I've been able to 
find to the breath mark; but they aren't terribly usable in practice due to space requirement (they're built more like emoji) and their generally 
having the wrong angle to them.

I note that Andrés Sanhueza, in L2/19-073, has proposed assigning code points to breath marks; and the document's portrayal of their appearance 
is perfect for my work. It would take a lot of the guess work out of an already emphasis-controlled and dramatic piece of line work, allowing me 
to allocate appropriate space for each breath mark.

I encourage you to adopt this proposal, as unicode is a major part of my workflow. Breath marks are industry recognized, function much as punctuation, 
and are recognized by our readers instinctively.

Date/Time: Tues July 01 13:00:21 PDT 2025
ReportID: ID20250701130021
Name: Lorna Evans
Report Type: Report Error in Publication Data
Opt Subject: Syriac chapter lists usage of U+0320


The Syriac chapter says that U+0320 is used in Syriac:
https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-9/#G45628

This information seems to have come from the original Unicode proposal on page 46. http://www.unicode.org/L2/L1998/98050-syriac-proposal.pdf

However, as I've worked on an Eastern Syriac font, I was clearly informed that U+0331 (COMBINING MACRON) is the character to use. 

Logically, it would make sense to pair:

0304 COMBINING MACRON
0331 COMBINING MACRON BELOW

rather than 

0304 COMBINING MACRON
0320 COMBINING MINUS SIGN BELOW

I think Table 9-14 in the core-spec should be changed from listing U+0320 to U+0331. 

This would also mean changing ScriptExtensions.txt from:

0320 ; Latn Syrc # Mn COMBINING MINUS SIGN BELOW

0331 ; Aghb Cher Goth Latn Sunu Thai # Mn COMBINING MACRON BELOW

to:

0320 ; Latn # Mn COMBINING MINUS SIGN BELOW

0331 ; Aghb Cher Goth Latn Sunu Syrc Thai # Mn COMBINING MACRON BELOW

My Syriac script contact is Sargon Hasso who is listed on the original Unicode proposal on page 3, #1 as one of the contacts with the user community. 
I also checked several of the Meltho fonts developed by George Kiraz, and those fonts also use U+0331 and not U+0320.

Feedback routed to Properties & Algorithms Working Group for evaluation [PAG]

Date/Time: Sat April 26 21:13:21 PDT 2025
ReportID: ID20250426211321
Name: Crystal Durham
Report Type: Report Error in Publication Data
Opt Subject: tr31-41 standard math profile seems incorrect


Quoting the relevant text from UAX#31§7.1:

The Mathematical Compatibility Notation Profile for default identifiers [...] is associated with a profile for UAX31-R3b, which consists of removing the 
characters in [[:Pattern_Syntax:] - [:ID_Compat_Math_Continue:]] from the set of characters with syntactic use (these are the characters ∂, ∇, and ∞).

This seems incorrect. The - here is a set difference operation, meaning that this standard profile as described asks to remove all syntax characters 
except those in [:ID_Compat_Math_Continue:]. Given that the parenthetical identifies that the removed characters are ∂, ∇, and ∞, I suspect that the 
set union operator & was supposed to be used instead. Thus the text should read

[...] removing the characters in [[:Pattern_Syntax:]&[:ID_Compat_Math_Continue:]] from the set [...]

or

[...] excluding the characters in [:ID_Compat_Math_Continue:] from the set [...]

instead.

Date/Time: Fri May 9 13:28:09 PDT 2025
ReportID: ID20250509132809
Name: Ned Holbrook
Report Type: Report Error in Publication Data
Opt Subject: Error in Permuting Combining Class Weights example

https://www.unicode.org/versions/Unicode16.0.0/core-spec/chapter-5/#G36537 attempts to give an example of “permut[ing] weights to have U+0651 ARABIC SHADDA 
precede all vowel signs” that appears to be incorrect: assuming that the internal weights behave the same as combining classes such that lower values appear 
first, I believe a correct example would remap shadda (CCC 33) to a weight of 27 (rather than 32, as shown). If so, https://unicode.org/faq/normalization.html#9 
also needs to be corrected.

Date/Time: Thur May 15 16:11:17 PDT 2025
ReportID: ID20250515161117
Name: President International Council on English Braille Judith Dixon
Report Type: Report Error in Publication Data
Opt Subject: U+2800 Braille pattern blank is categorized as Os

U+2800 is categorized as Os, Other symbol so does not behave like a space. ICEB, the standard-setting body for English language braille publishes codebooks. 
U+2800 does not break words. We must use U+0020 which is not sized correctly. It would be better if U+2800, Braille Pattern Blank, which is always used as 
a space was categorized as Zs.

Date/Time: Fri June 13 11:31:46 PDT 2025
ReportID: ID20250613113146
Name: Jeff Davis
Report Type: Report Error in Publication Data
Opt Subject: Misleading text in SpecialCasing.txt

Regarding a question I asked about ICU here: https://groups.google.com/a/unicode.org/g/icu-support/c/n09HXzVZCWk"

The maintainers suggested that the text in SpecialCasing.txt about the treatment of U+0345 when uppercasing or titlecasing is misleading, and that I should 
file a report. I don't claim expertise in this area, but I am confused about where, how, and under what conditions the rule in SpecialCasing.txt should be applied.

Specifically:

(a) It's listed under "Unconditional mappings", but ICU only seems to make use of the rule when using a Greek locale; and
(b) Moving characters around doesn't seem to be permitted by toUppercase(X) as defined in 16.0 3.13.2 R1. More generally, moving characters around doesn't seem to 
be a "mapping" as I understand it.

Feedback routed to Emoji Standard & Research Working Group for evaluation [ESC]

Date/Time: Sat June 14 13:05:09 PDT 2025
ReportID: ID20250614130509
Name: Sebastian Norr
Report Type: Emoji feedback
Opt Subject:

Suggestion to improve Unicode emoji standardization.

Problem:
I have for a long time been annoyed that some emojis are turned the wrong direction, this lack of control makes it impossible to create some reasonably 
looking emoji combinations, 

for example:
This could be interpreted as "the horse says" or "from the horses mouth".
BUT the horse it turned to the left, and the speech bubble is turned to the right, so even if you swap the order of them

it still looks wrong, and there are nothing I can do to make it right, (except for writing this email).

Solution:
The solution is as simple as adding a swap-function.
there is no need to create an extra set of emojis, that doubles the amount of character codes & requires even more memory to be used to store an 
exact copy of ALL the emojis. No, SIMPLY read the image-data in reverse order (right-to-left, instead left-to-right (or however the data is normally 
read)), in the same way you can also mirror & flip the emoji upside-down too.

So by just reading the data in different directions, you have instantly created FOUR times as many emojis.
----
Ok, but how do I in the back-end symbolize that this is the kind of emoji that will be flipped & in what way?

Technical explanation:
From what I understand there are numerical codes that get sent (eg. 1 222 = or something)
Then there are more complex data that require 2 (or more) numbers to represent that symbol (eg. Asian characters), that is done something like this 
(2 2222 3333) where 2 represents a identifier that says that this is a multi-number symbol, and it is a 2 meaning that the number consist of 2 
segments of numbers (of course, that is zero-indexed, so just do a -1 on this example, but this is easier to read for humans).

So, from that example, the new code could be some variation like this:
(2 3333 L) 2 = 2-segments, 3333 = ID-code & L=Left (mirrored)
Some other letter suggestions:
L / R / U / D = Left / Right / Up (normal, omitted?) / Down (upside-down)
or just: M = mirror (sideways), and maybe: U = Upside-down mirroring.
Or anything else, when someone comes up with a smarter solution.
----
The following allows for more complexity, but probably isn't so useful because pixel data becomes distorted when you rotate it anything else than 
in complete 90-degree angles, but it is worth considering anyway:

The rotation-code could even be in degrees, from 0 to 359 degrees AND you could use negative numbers to indicate that the rotation should be 
flipped upside-down too.

For example:
(note from the future: I had a little thought error in this example, I have mirrored across the up/down axis, even though I wrote above "flipped 
upside-down", but the example should still get the point across.)
(note, I'm doing my best within the current limitations, I can't simulate the complete rotation of the edge of the emoji: ???? --> )
⬆️ = baseline arrow (imagine a red dot  in top right corner: 45 degree, to break symmetry)
️ = rotation data: +90
↗️ = rotation data: +45
↖️ = rotation data: +315 (not mirrored, = top at 0 degrees)
↖️ = rotation data: -45* (the arrow is mirrored, the is at 270 degrees)
*The emoji has first been rotated by +45 degree and then been mirror flipped, the order of operations are important, if you invert the order of 
operations, then the result becomes: ↗️ and at 0 degrees.

Maybe the rotation code could be something like:
+45 D = first rotate by 45 degrees and THEN upside-Down-mirror the result.
and
D +45 = upside-Down-mirror the result THEN rotate it by 45 degrees?
because that gives indifferent results, and might allow for more efficient rotations somehow?
----

Hope this wasn't too confusing? (especially that last part?)
I still think it would be very beneficial to be able to rotate emojis to be the correct way around for whatever we need to use them for.
(I also have a future suggestion for stacking emoji-components to create custom emojis, but that's a suggestion for later, I still need 
to think some more about it).

Hope this is useful and can be implemented in some (improved) fashion in a not too distant future.
Have a nice day!

Feedback routed to Editorial Working Group for evaluation [EDC]

Date/Time: Fri May 30 14:17:22 PDT 2025
ReportID: ID20250530141722
Name: Matthew King
Report Type: Report Error in Publication Data
Opt Subject: Missing example in figure 2-5

Unless my PDF viewers are faulty (I tried two on different computers/OSs) Example 5 in figure 2-5, which should be horizontally rendered 
Japanese text, is missing.

It is slightly amusing that the examples read "Please see page 1123" (presumably the other languages which I don't speak say the same thing). 
Perhaps in review it was misunderstood as an actual reference? Page 1123 does not appear to be relevant.

This is in Unicode 16. The other version I happen to have is 14 which does display example 5 correctly. I am unsure about 15.

Other Reports

(None at this time.)

L2/25-163