Unicode Frequently Asked Questions

Submitting Successful Character and Script Proposals

Q: How do I write a successful proposal for some overlooked characters or for an entire script?

Before investing time in writing a proposal, you should first verify that the character or script is eligible and hasn’t already been proposed or rejected.

Q: How do I verify that a character or script is eligible for encoding?

Often a proposed character can already be expressed as a sequence of one or more existing Unicode characters. Encoding it separately would create a duplicate representation, so such a character is not suitable for encoding. (In any event, the proposed character would disappear when normalized.) For example, a g-umlaut character is not suitable for encoding, because it can already be expressed with the sequence <g, combining diaeresis>. For further information on such sequences, see Where is my Character and the FAQ page Characters, Combining Marks.
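The point about sequences and normalization can be checked with a short sketch. It uses Python's standard unicodedata module; the helper name nfc_length is hypothetical and chosen only for this illustration, and the results reflect whatever version of the Unicode Character Database ships with your Python build.

```python
import unicodedata


def nfc_length(sequence: str) -> int:
    """Return how many code points remain after NFC normalization."""
    return len(unicodedata.normalize("NFC", sequence))


g_umlaut = "g\u0308"  # LATIN SMALL LETTER G + COMBINING DIAERESIS
u_umlaut = "u\u0308"  # LATIN SMALL LETTER U + COMBINING DIAERESIS

# u + diaeresis composes to the existing precomposed character U+00FC.
print(nfc_length(u_umlaut))

# g + diaeresis has no precomposed form, so the sequence stays as it is.
# The sequence itself is the supported representation of g-umlaut.
print(nfc_length(g_umlaut))

# The precomposed character and the sequence are canonically equivalent.
print(unicodedata.normalize("NFD", "\u00fc") == u_umlaut)
```

If a character you are considering can be typed as a base letter plus existing combining marks, as in the second case, it is very likely already representable and not a candidate for a new code point.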

Q: How do I know if a character has already been proposed or rejected?

Before proceeding, determine that each proposed addition is a character according to the definition given in the Unicode Standard and that the proposed addition does not already exist in the Standard. Consult the Proposing New Characters (Pipeline Table) page and the links on the About the Pipeline Table page to see if the character is already on track to be encoded, is under investigation, or has been rejected.

If the character or script is not listed, ask on the Unicode email list or contact the Unicode Consortium using the online contact form.

Q: What are the steps in proposing an eligible character for encoding?

The process for characters and scripts varies depending upon what is being proposed.

Proposal documents for review should be sent to docsubmit@unicode.org. If you have never submitted a proposal before, you may use our online contact form to inquire, and we will send you instructions to prepare or submit your proposal.

The page on Submitting New Characters or Scripts also has information about the formal process that may be required later in the consideration of a proposal.

Q: That page on Submitting New Characters or Scripts mentions a proposal summary form. What is that and do I need to fill it out now?

Successful UTC proposals eventually need to be considered by the ISO character encoding committee, SC2/WG2, which requires a proposal summary form. This form requests basic information about the submitter, the type of proposal and its justification in a standardized format.

However, the form is not actually required by the UTC to start considering a character encoding proposal, so you do not need to fill out that form at this point.

Q: What happens once I submit my proposal to the Unicode Technical Committee?

Your proposal will be assigned a document number. It will most likely be referred to the appropriate group handling new proposal submissions (CJK & Unihan, Emoji Subcommittee, or Script Ad Hoc). A member of the relevant group will contact you with feedback after the group has discussed it, and you may be asked to revise it. The group may recommend that it be scheduled for consideration for approval at an upcoming UTC meeting. These occur quarterly. A UTC member will contact you with feedback after the discussion if you are not there in person.

A proposal will go through a number of these review cycles before being accepted, while the authors make changes based on feedback received.

Q: What format should I use for my document?

A UTC document requires a simple header that gives a title, name of the submitter and the date in any style. Each page must have a footer or header that includes some identifying mark (such as author) and the page number.

Q: What do I need to include in my document?

A document requires supporting evidence, which is best incorporated into the document itself rather than supplied as links. For new characters, provide images clearly showing the characters in use, with their glyph circled or clearly identified, along with a caption that describes the character and the source of the image. Most importantly, your document needs to contain a clear proposal with specific suggestions, such as: add this character with this glyph, this name, and these properties to this block (you can leave the code position open).

Q: What character properties need to be included in my proposal?

  1. Properties typically supplied for all proposed characters include Script, Bidirectional_Class, combining class (ccc), case or decomposition mappings (if any) and the General_Category.
  2. If a proposed script has a clear analogue to an already encoded script, certain script-specific properties like Joining_Group or Indic_Syllabic_Category might be readily extended to the new script and your proposal should include suggested values.
  3. If your proposal contains new digits or other numeric characters, the numeric values for each need to be spelled out exactly.
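When choosing suggested values, it can help to look at the properties of an analogous character that is already encoded. The following is a minimal sketch using Python's standard unicodedata module. The function name describe and the dictionary keys are hypothetical and chosen for this illustration. Note that unicodedata does not expose the Script property or script-specific properties such as Joining_Group, so those would have to be looked up in the Unicode Character Database files directly.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect a few core properties of an already encoded character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        "General_Category": unicodedata.category(ch),
        "Bidirectional_Class": unicodedata.bidirectional(ch),
        "ccc": unicodedata.combining(ch),
        "decomposition": unicodedata.decomposition(ch),
        # None when the character has no numeric value.
        "numeric_value": unicodedata.numeric(ch, None),
    }


# A combining mark, a decimal digit, and a fraction as reference points.
for sample in ("\u0308", "\u0967", "\u00bd"):
    for key, value in describe(sample).items():
        print(f"{key}: {value!r}")
    print()
```

The output for an existing digit or combining mark gives a concrete model for the values a proposal would list for a comparable new character, including the exact numeric value required for digits and other numeric characters.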

In addition to providing suggested property values, your proposal needs to spell out the behavior of the proposed character or writing system in some detail, especially for punctuation. This allows verification that any suggested property values do in fact match the expected behavior. In addition, people familiar with the way the various Unicode algorithms make use of character properties can then contribute to ensure the correct property assignments.

Q: What character behaviors need to be described in my proposal?

The following list covers the core behaviors (where applicable) that need to be described:

  1. Rendering rules and directional behavior
  2. Collation and casing behavior
  3. Segmentation behavior (grapheme cluster, word break, line break)
  4. Any special behaviors (such as for notational systems)

The Script Ad Hoc will work with you to make sure that these are covered.
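As a rough illustration of why casing and directional behavior need to be spelled out rather than assumed, the sketch below inspects a few existing characters with Python's built-in string methods and the standard unicodedata module. The helper names round_trips_case and is_mirrored are hypothetical. Segmentation and collation are not covered here, because the Python standard library does not implement the corresponding Unicode algorithms.

```python
import unicodedata


def round_trips_case(ch: str) -> bool:
    """True if uppercasing and then lowercasing returns the original."""
    return ch.upper().lower() == ch


def is_mirrored(ch: str) -> bool:
    """True if the character has the Bidi_Mirrored property."""
    return unicodedata.mirrored(ch) == 1


# Simple one-to-one case pair: survives the round trip.
print(round_trips_case("a"))

# U+00DF uppercases to the two-letter string "SS", so the round trip
# does not restore the original character.
print("\u00df".upper(), round_trips_case("\u00df"))

# Parentheses are mirrored in right-to-left text; letters are not.
print(is_mirrored("("), is_mirrored("a"))
```

Cases like the second one are the kind of behavior a proposal should describe explicitly, so that reviewers can check the suggested case mappings and bidirectional properties against the intended behavior.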

Q: Are there some examples of successful proposals I could look at?

Yes, for a successful proposal to encode a complex script, you might look at the Balinese script proposal. The Modi and Hanifi Rohingya proposals are good examples of how to include the requested property information. The Sharada Candrabindu, Glagolitic, Bosnian additions (Arabic script), or Four Sindhi characters for Devanagari proposals are good examples of how to propose additions of characters to an existing script. There’s also a discussion paper and Indic Number forms.

Other resources where you could find further examples include a range of proposals in the WG2 documents and a long list of good proposals.

Q: How should I organize the information in my proposal so that the committee can evaluate it easily?

The best format is to start with the proposal, and follow it with the rationale and further background. Character proposals should provide a brief historical background on the character(s), a character code chart (if there are many characters being proposed), a list of properties, and a bibliography. If the proposal is requesting the addition of a number of characters to an existing encoded script, it is helpful to provide a proposed code chart showing both the existing characters and the requested characters, so they can be seen and evaluated together in context.

It is essential that proposals for additions to an existing script include samples showing the proposed characters in context. The proposed character should be highlighted in some way to make it easier for reviewers to understand the samples, and the source of each sample should be clearly identified. If substantial, this information may be in an appendix or a separate document, but it cannot be a link to some website: all parts of a proposal must be included in the submission, so that it can be archived as a permanent part of the document register.

Q: Would a summary help my proposal?

Yes, a one-paragraph summary at the top can help the reader understand what the whole document is about, and will help the committee members (who are not familiar with all aspects of your issue) to put things into context.

Q: What else is involved besides submitting my proposal?

You will need to be actively involved in following up for your proposal to succeed. Also, you need to be prepared to make changes to your proposal in response to feedback from the group reviewing your proposal.

For complex or controversial proposals, it is extremely helpful to find someone attending the Unicode Technical Committee to champion your cause and help explain and move it along. Some complex proposals can require many meetings before they are accepted, and active involvement is required for their success. Of course, straightforward proposals also benefit from a champion in the UTC.

Q: What if I don’t need a new character, but want to change something else in the standard?

There are some limits on possible changes to the standard, so make sure that you first read the Unicode Character Encoding Stability Policy. Assuming your requested change is not prohibited, provide a document with a clear proposal and supporting evidence. It should contain specific suggestions, such as "change the value of this property for this character from X to Y" or "change this text in the standard on a particular page or in a particular UAX to some other specified text." The Contact Form has links to report errors and to submit proposals to the Unicode Technical Committee (UTC) and other committees.

Q: What if I don’t have a solution to the problem?

If you have found a problem, but are not yet able to propose a solution, you might find it helpful to first discuss the issue on the Unicode Public E-Mail List to solicit ideas, so that your proposal can offer a solution. Pointing out a range of possible solutions can sometimes be very useful. It is best if these are presented as concrete alternatives, each fully spelled out. Sometimes it is also appropriate to submit commentary documents and problem statements. Even if you are unable to suggest a solution, it is essential to collect thorough documentation of the nature of the problem, where it is encountered, and how it fits within the context of a script or notational system.

Q&A contributed by [JDA] & [AF]