Archive of Notices of Non-Approval

This page archives significant decisions by the Unicode Technical Committee not to approve certain proposed characters.

Notices of non-approval decisions are numbered for reference, and are listed roughly in reverse chronological order, so that more recent decisions are at the top of the list. The reference number for each notice consists of the year of the original decision, followed by a dash and another digit to indicate sequence order. The date shown at the right of each notice is the posting date for that notice in this archive. Posting dates for a notice may occasionally be updated, whenever the UTC revisits a decision of non-approval and updates the decision in some way. Significant dates related to the actual UTC decision(s) involving a non-approval are listed in the disposition section of each notice.

Note that the UTC routinely declines to approve various character proposals at a particular meeting. In most instances such decisions are simply part of the ongoing process of feedback, revision, and review of exploratory or otherwise incomplete proposals. Such ongoing proposal review does not constitute formal non-approval, and is not tracked on this page.

Occasionally, however, the UTC makes a formal decision to reject a character proposal, for a variety of architectural reasons, or because of other serious defects in the proposal. Such formal decisions result in a notice of non-approval posted on this page, and signal the intent of the UTC not to pursue further work on that proposal. In some instances such a formal decision is also designated by the UTC as constituting a precedent (see Section 10.5.2, "Precedents" in the Technical Group Procedures); any precedent would require a special majority in the UTC to be reconsidered for a change of decision at a later time. Any notices of non-approval which also constitute UTC precedents are explicitly identified as such in this archive.

2019-1 Variation Selector for Italics 2019-Jul-01

Proposal to designate a variation selector to represent italic text. (See L2/19-063, L2/19-195.)

Disposition: The UTC rejected the proposal. The representation of italic text has been considered a style issue, to be handled by text markup (or other techniques outside the scope of plain text), since the very inception of the Unicode Standard. Introduction of variation sequences purporting to represent italic text would only introduce ambiguity into text representation and confusion in implementations. The existence of special-purpose italic-styled characters for use only in limited mathematical contexts is explicitly not to be construed as a precedent for the introduction of more italic characters in plain text, either as atomically encoded characters or as variation sequences. Furthermore, variation selector characters are not to be used for and will not be designated for textual effects which are inherently scoped across spans of text, as is the case for italic styling. References to UTC Minutes: [159-C24], May 2, 2019.

2017-1 DECIMAL SEPARATOR 2017-Oct-25

Proposal to encode a character DECIMAL SEPARATOR, distinct from U+002E FULL STOP. (See L2/17-324.)

Disposition: The UTC rejected the proposal. For decades, software has interpreted characters such as U+002E FULL STOP (= period) and U+002C COMMA (depending on the user’s exact settings) in human-computer interaction. The details of this interpretation are not in the scope of a character encoding (see CLDR). Furthermore, unambiguous interchange is already served by either non-textual binary formats, or by the use of Unicode text using specifically agreed-upon code points. Encoding a distinct character now for a decimal separator would only create confusion and ambiguity in numeric representation. References to UTC Minutes: [153-C33], October 25, 2017.
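As a hedged illustration of interchange using agreed-upon code points, the following Python sketch parses a numeric string once both parties have agreed which existing character acts as the decimal separator. The helper name parse_decimal and its parameter are assumptions made for this example; they are not part of any Unicode or CLDR interface, and grouping separators are deliberately left out.

```python
from decimal import Decimal

# Existing characters that software already interprets as decimal separators.
FULL_STOP = "\u002E"
COMMA = "\u002C"


def parse_decimal(text: str, decimal_separator: str = FULL_STOP) -> Decimal:
    """Parse an ungrouped decimal string (hypothetical helper for illustration)."""
    if decimal_separator not in (FULL_STOP, COMMA):
        raise ValueError("separator must be U+002E or U+002C")
    other = COMMA if decimal_separator == FULL_STOP else FULL_STOP
    if other in text:
        raise ValueError("grouping separators are not handled in this sketch")
    # Map the agreed-upon separator to the form that Decimal expects.
    return Decimal(text.replace(decimal_separator, FULL_STOP))


print(parse_decimal("3.14"))          # 3.14
print(parse_decimal("3,14", COMMA))   # 3.14
```

The interpretation lives in the agreement between sender and receiver (here, the second argument), not in the identity of the character.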

2013-2 SQUARE WITH SPECKLES FILL 2013-Feb-06

Proposal to encode a character SQUARE WITH SPECKLES FILL. (See L2/12-317.)

Disposition: The UTC rejected the proposal. The reason given in L2/12-317 for the encoding of this character was that the existing mapping of U+2592 MEDIUM SHADE to the Korean standard KS X 1001 was incorrect, because similarly shaped characters in KS X 1001 (and Code Page 949) were mapped to graphic characters from the Geometric Shapes block with names SQUARE WITH HORIZONTAL FILL, etc. However, the existing mapping has longstanding implementation practice. The encoding of a new character, whose only justification would be to change the existing mappings for Korean, would be destabilizing and would result in data corruption on conversion. Incidentally, this is not the first time this exact issue has arisen. The Korean National Body requested encoding of a new character SQUARE WITH DOTS for the same remapping purpose in 2010. After discussion in WG2 about the consequences of such a change, the Korean National Body withdrew their request. (See the WG2 minutes of that meeting in WG2 N3903 and the Final Disposition of Comments in WG2 N3936.) References to UTC Minutes: [134-C7], January 28, 2013.

2013-1 US FLAG SYMBOL 2013-Feb-06

Proposal to encode a character for a US flag symbol. (See L2/12-094.)

Disposition: The UTC rejected the proposal. The mapping to an existing emoji symbol for the US flag is already possible by using pairs of regional indicator symbols. Additionally, the domain of flags is generally not amenable to representation by encoded characters, and the UTC does not wish to entertain further proposals for encoding of symbol characters for flags, whether national, state, regional, international, or otherwise. References to UTC Minutes: [134-C2], January 28, 2013.
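The pairing of regional indicator symbols mentioned above can be sketched in Python as follows. The function name regional_indicator_pair is an assumption made for this example; whether a given pair is displayed as a flag depends on the font and platform, which this sketch does not check.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # U+1F1E6 REGIONAL INDICATOR SYMBOL LETTER A


def regional_indicator_pair(region_code: str) -> str:
    """Map a two-letter ASCII region code to a pair of regional indicator symbols."""
    if len(region_code) != 2 or not region_code.isascii() or not region_code.isalpha():
        raise ValueError("expected a two-letter ASCII region code")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(ch) - ord("A"))
        for ch in region_code.upper()
    )


us_flag = regional_indicator_pair("US")
print([f"U+{ord(ch):04X}" for ch in us_flag])  # ['U+1F1FA', 'U+1F1F8']
```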

2012-2 EXTERNAL LINK SIGN 2012-June-06

Proposals to encode a character for the "external link sign", which is often seen as a graphic element indicating a link to a document located external to the website where the page using the external link sign resides. (See L2/06-268, L2/12-143, L2/12-169.)

Disposition: The UTC rejected the proposals to add "external link sign", most recently in L2/12-169. It is unclear that the entity in question is actually an element of plain text, given the inevitable connection to its function in linking to other documents, and thus its coexistence with markup for links. Furthermore, the existing widespread practice of representing this sign on web pages using images (often specified via CSS styles) would be unlikely to benefit from attempting to encode a character for this image. (This notice of non-approval should not be construed as precluding alternate proposals which might propose encoding a simple shape-based symbol or symbols similar in appearance to the images used for external link signs, should an appropriate plain-text argument for the need to encode such a simple graphic symbol be forthcoming.) References to UTC Minutes: [131-C26], May 10, 2012.

2012-1 TAMIL SCRIPT RE-ENCODING 2012-March-05

Proposal to re-encode the Tamil script on different principles, encoding syllables and pulli consonants as atomic characters. (See L2/12-033.)

Disposition: The existing encoding of the Tamil script is widely deployed in successful implementations. No convincing evidence exists of any grave deficiency in the existing encoding that would present an insurmountable obstacle to its use in representing Tamil. On the contrary, re-encoding any script would not only be damaging to existing implementations but would also lead to data corruption, incompatibilities in interchange, and user confusion. The UTC rejected this proposal and will not entertain further requests for re-encoding of the Tamil script. References to UTC Minutes: [130-M2], February 7, 2012.

2011-2 HEXADECIMAL DIGITS 2011-May-24

Proposal to encode "A".."F" for displaying hexadecimal digits. (See L2/03-386.)

Disposition: The UTC turned down this proposal as duplicate encoding. "A".."F" are already encoded as U+0041..U+0046, and hexadecimal digits are already universally implemented using those characters or their lowercase forms, U+0061..U+0066. Separately encoding "A".."F" based on their function as hexadecimal digits would only disrupt existing implementations and introduce ambiguity into the representation of hexadecimal digits. References to UTC Minutes: [127-C6], May 10, 2011.
See also the FAQ on hexadecimal digits.
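A minimal Python sketch of the existing practice described above: the ordinary Latin letters already serve as hexadecimal digits, in either case. The helper name parse_hex is an assumption made for this example.

```python
def parse_hex(digits: str) -> int:
    """Interpret a string of hexadecimal digits using the built-in int()."""
    return int(digits, 16)


# "A".."F" and "a".."f" are the ordinary Latin letters.
print([f"U+{ord(ch):04X}" for ch in "AFaf"])  # ['U+0041', 'U+0046', 'U+0061', 'U+0066']
print(parse_hex("1F"), parse_hex("1f"))        # 31 31
```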

2011-1 SUBSCRIPT SOLIDUS 2011-May-24

Proposal to encode subscript solidus as a modifier letter.

Disposition: The UTC did not agree that the exemplified usage constituted plain text or required encoding as a modifier letter. The fact that some commercial software modules can only handle plain text is an insufficient argument for claiming that any particular printed superscripted or subscripted character must be encoded as a separate character. The solidus is already encoded. The character in question is simply a solidus shown in a subscripted expression, which can be represented via markup or styling. References to UTC Minutes: [126-C8], February 8, 2011.

2010-1 FLORIN CURRENCY SYMBOL 2010-May-11

Proposal to disunify the existing U+0192 LATIN SMALL LETTER F WITH HOOK, currently serving as the Florin currency symbol, and instead encode a separate currency symbol.

Disposition: 2010-May-11, rejected by the UTC as disruptive of too much existing data and too many existing mappings and implementations. For a currency symbol of mostly historical interest, the disunification was considered too problematic to undertake. References to UTC Minutes: [123-C19]

2008-2 TELUGU SIGN ARDHAVISARGA 2008-May-14

Proposal to encode the Vedic ardhavisarga sign as a Telugu script character. L2/06-250.

Disposition: The originally approved U+0C71 TELUGU SIGN ARDHAVISARGA (2006-Aug-11) was reproposed as a generic Vedic sign. On 2008-May-14, the UTC superseded its earlier approval, and instead approved the generic Vedic sign as U+1CF2 VEDIC SIGN ARDHAVISARGA. References to UTC Minutes: [115-C6]

2008-1 AVESTAN SEPARATION POINT 2008-Feb-08

Proposal to encode a middle dot as a separation point for Avestan. L2/07-006.

Disposition: 2008-Feb-08, rejected by the UTC as a duplicate of U+2E31 WORD SEPARATOR MIDDLE DOT. This character had progressed to ballot for Amd 5 to ISO/IEC 10646:2003 (as U+10B38), but was removed from that ballot by disposition of comments, 2008-Apr-25. References to UTC Minutes: [114-A76]

2006-1 MALAYALAM CONSONANT SIGN CILLU 2006-Nov-08

Proposal to encode a separate cillu sign for use as a diacritic for other Malayalam letters. L2/06-261.

Disposition: 2006-Nov-08, rejected by the UTC on architectural grounds, as inconsistent with the decision to encode atomic chillu characters for Malayalam. References to UTC Minutes: [109-A74]

2004-5 Capital Double S 2007-May-18

Proposal to encode a Capital Double S for German. L2/04-395.

Disposition: 2004-Nov-18, rejected by the UTC as a typographical issue, inappropriate for encoding as a separate character. Rejected also on the grounds that it would cause casing implementation issues for legacy German data. Decision later revisited 2007-May-18, on the basis of a revised proposal, L2/07-108. Now standardized as U+1E9E LATIN CAPITAL LETTER SHARP S in Unicode 5.1. References to UTC Minutes: [101-C22], [101-A74], [111.M1]
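The casing concern can be illustrated with a short Python sketch. It shows only how Python's built-in string methods behave for these two characters; it is not a statement about what any particular German-language implementation should do.

```python
SMALL_SHARP_S = "\u00DF"    # LATIN SMALL LETTER SHARP S
CAPITAL_SHARP_S = "\u1E9E"  # LATIN CAPITAL LETTER SHARP S

# Uppercasing the small letter yields "SS", not U+1E9E.
print(SMALL_SHARP_S.upper())                     # SS

# Lowercasing the capital letter yields the small sharp s.
print(CAPITAL_SHARP_S.lower() == SMALL_SHARP_S)  # True

# Consequently an upper/lower round trip does not restore the original text.
word = "Stra\u00DFe"
print(word.upper())                              # STRASSE
print(word.upper().lower() == word.lower())      # False
```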

2004-4 MODIFIER LETTER STRAIGHT APOSTROPHE 2006-Nov-10

Proposal to encode an unambiguously straight form of the modifier letter apostrophe, for use in Latin script orthographies. L2/04-372.

Disposition: 2004-Nov-18, rejected by the UTC as not distinct from existing encoded characters for modifier letter apostrophes. Decision later overturned 2006-Nov-10, on the basis of a revised proposal, L2/06-259, to encode a casing pair of letters. These were standardized as U+A78B LATIN CAPITAL LETTER SALTILLO and U+A78C LATIN SMALL LETTER SALTILLO in Unicode 5.1. References to UTC Minutes: [101-A87]

2004-3 Combining Umlaut 2004-June-18

Proposal to encode a combining umlaut character, distinct from U+0308 COMBINING DIAERESIS. L2/04-210.

Disposition: 2004-June-18, rejected by the UTC as an inappropriate disunification of the existing character. Functional disunification of combining marks which are otherwise identical in appearance is inappropriate, and other mechanisms to maintain this functional distinction in text are available. Proposal was submitted to WG2, 2004-June-21, but never progressed. References to UTC Minutes: [99-C35], [99-M6]

2004-2 Roman Canopy Character 2004-June-18

Proposal to encode a character to represent the "canopy" mark over Roman numerals in classical Latin text. L2/04-137.

Disposition: 2004-June-18, rejected by the UTC as inappropriate for encoding as a character. This kind of textual convention should instead be represented by markup. Proposal was submitted to WG2, 2004-June-21, but never progressed. References to UTC Minutes: [99-A46]

2004-1 Ideographic Square Symbols 2004-June-18

Proposal to encode two square symbols for use with ideographs: IDEOGRAPHIC WHITE SQUARE and IDEOGRAPHIC BLACK SQUARE. L2/04-029.

Disposition: 2004-June-18, rejected by the UTC as duplicates of the existing U+25A1 WHITE SQUARE and U+25A0 BLACK SQUARE. Proposal was submitted to WG2, 2004-June-21, but never progressed. References to UTC Minutes: [99-A53]

2001-3 Klingon Script 2001-May-21

Proposal to encode the Klingon script. L2/97-273, L2/01-212.

Disposition: 2001-May-21, rejected by the UTC as inappropriate for encoding, for multiple reasons stated in L2/01-212. (Lack of evidence of usage in published literature, lack of organized community interest in its standardization, no resolution of potential trademark and copyright issues, question about its status as a cipher rather than a script, and so on.) References to UTC Minutes: [87-M3], [87-A15]

2001-2 KHMER SIGN LAAK 2001-Jan-31

Proposal to encode one sign for the Khmer script, KHMER SIGN LAAK, proposed for U+17DD.

Disposition: 2001-Jan-31, rejected by the UTC as being just a glyph variant of the existing encoded character U+17D8 KHMER SIGN BEYYAL. The preferred representation of this sign (and its alternates) is by spelling it out explicitly. The proposal was submitted to WG2, but ceased progression on 2000-Sep-25 at ISO Stage 2. References to UTC Minutes: [86-M17], [86-A35]

2001-1 GEORGIAN LETTER U-BRJGU 2001-Jan-31

Proposal to encode one precomposed Georgian letter.

Disposition: 2001-Jan-31, rejected by the UTC as a precomposed letter already represented by the sequence <U+10E3, U+0302>. References to UTC Minutes: [86-M22], [86-A52]
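As a small illustration of the sequence cited above, the following Python sketch builds <U+10E3, U+0302> and applies NFC normalization using the standard unicodedata module. Because no precomposed form is encoded, the text remains a two-code-point sequence after normalization.

```python
import unicodedata

# Base letter followed by a combining mark, as cited in the disposition.
sequence = "\u10E3\u0302"

nfc = unicodedata.normalize("NFC", sequence)
print(len(nfc))                               # 2
print([unicodedata.name(ch) for ch in nfc])
# ['GEORGIAN LETTER UN', 'COMBINING CIRCUMFLEX ACCENT']
```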

2000-1 Ligature Control Characters 2000-Feb-03

Proposal to encode two ligature control characters, ZERO WIDTH LIGATOR and ZERO WIDTH NONLIGATOR. L2/99-379, L2/00-012, L2/00-025, L2/00-031.

Disposition: 2000-Feb-03, rejected by the UTC on architectural grounds. The UTC assessment was that no forced control of ligation was feasible, and instead decided to fully document the use of the existing ZWJ and ZWNJ with regard to ligature formation. References to UTC Minutes: [82-M17], [82-M18]

1998-2 SOFT SPACE 1998-Dec-01

Proposal to encode a conditional space, to be used in line breaking or text justification in scripts, such as Khmer, which do not use regular spaces to delimit words. L2/98-373.

Disposition: On 1998-Dec-01, the UTC rejected this solution as an inappropriate duplication of the intended use of U+200B ZERO WIDTH SPACE in these scripts. The UTC subsequently addressed the issue by clarification of the line breaking and justification behavior of U+200B. References to UTC Minutes: [78-M1]
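The role of U+200B as an invisible break opportunity can be sketched in Python as follows. The helper names and the Latin placeholder text are assumptions made for this example; real line breaking follows the Unicode Line Breaking Algorithm (UAX #14), which this sketch does not implement.

```python
ZWSP = "\u200B"  # ZERO WIDTH SPACE


def break_opportunities(text: str) -> list[str]:
    """Split text into the segments between ZERO WIDTH SPACE characters."""
    return text.split(ZWSP)


def visible_text(text: str) -> str:
    """Return the text as it would appear on one line, without the ZWSPs."""
    return text.replace(ZWSP, "")


# Placeholder words stand in for a script written without spaces between words.
sample = "one" + ZWSP + "two" + ZWSP + "three"
print(break_opportunities(sample))  # ['one', 'two', 'three']
print(visible_text(sample))         # onetwothree
```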

1998-1 Ecological Symbols 1998-Feb-26

Proposal to encode two symbols related to recycling: RECYCLE SIGN and DER GRUENE PUNKT. L2/98-025.

Disposition: 1998-Feb-26, rejected by the UTC because of the status of DER GRUENE PUNKT as a trademarked logo. The other symbol was later accepted and standardized as U+2672 UNIVERSAL RECYCLING SYMBOL in Unicode 3.2.

1997-3 MODIFIER LETTER MIDDLE DOT 1997-Dec-05

Proposal to encode a middle dot character to function as a modifier letter.

Disposition: 1997-Dec-05, withdrawn by author.

1997-2 Mid-level Hamzah 1997-Jul-22

Proposal to encode a mid-level hamzah character for the Arabic script.

Disposition: 1997-Jul-22, withdrawn by author.

1997-1 Phaistos Disc 2006-May-19

Proposal to encode 45 characters for the Phaistos Disc "script". L2/97-106.

Disposition: 1997-May-29, proposal not accepted by the UTC, in part because of questions about the identity of the characters in question as a script. A revised proposal for encoding 46 characters as pictographic symbols without a specific claim of them being letters of a script in the Unicode sense was accepted by the UTC on 2006-May-19 and was standardized in Unicode 5.1.

1996-3 Arabic Presentation Forms for Uighur 1996-Dec-06

Proposal to encode a large number of Arabic presentation forms for Arabic letters used in the Uighur, Kazakh, and Kirghiz languages.

Disposition: 1996-Dec-06, rejected by the UTC on architectural grounds. Encoding of positional variant glyphs for Arabic letters as characters is not required for correct rendering support, and cannot be justified by analogy to earlier sets of Arabic presentation forms encoded as compatibility characters.

1996-2 Yoruba Precomposed Latin Letters 1996-Sep-07

Proposal to encode 14 precomposed Latin letters, for use in representing the Yoruba language.

Disposition: 1996-Sep-07, rejected by the UTC as precomposed letters already represented by encoded letters plus combining marks.

1996-1 Armenian Punctuation Characters 1997-Jul-04

Proposal to encode 15 Armenian script-specific punctuation characters.

Disposition: 1996-Mar-06, rejected by the UTC on the grounds that most of the proposed characters were unnecessary disunifications of already-encoded general punctuation characters. 1997-Jul-04, proposal stopped progress in WG2 at Stage 2. One character was accepted and later standardized as U+058A ARMENIAN HYPHEN.