Unicode 6.3.0 (In preparation)

Unicode 6.3.0 is under preparation as a minor version of the Unicode Standard and will supersede all previous versions. This page summarizes the important changes for the Unicode Standard, Version 6.3.0. In the discussion below, Version 6.3.0 may be abbreviated as "Unicode 6.3" or "Version 6.3."

 

Note that the Core Specification for Version 6.3 is unchanged from Version 6.2. As a consequence, links to the chapters of the Core Specification resolve to the same PDF files as for Version 6.2.

 


Contents of This Document

A. Summary
B. Version Information
C. Stability Policy Update
D. Textual Changes and Character Additions
E. Conformance Changes
F. Changes in the Unicode Character Database
G. Changes in the Unicode Standard Annexes
H. Changes in Synchronized Unicode Technical Standards

A. Summary

Version 6.3 of the Unicode Standard is a special release focused on the update of the Unicode Bidirectional Algorithm. This version also rolls in various minor corrections for errata and other small updates for the Unicode Character Database.

Two other important Unicode specifications are maintained in synchrony with the Unicode Standard, and have updates for Version 6.3:

  • UTS #10, Unicode Collation Algorithm
  • UTS #46, Unicode IDNA Compatibility Processing

This version of the Unicode Standard is synchronized with ISO/IEC 10646:2012, plus the accelerated publication of five bidirectional format control characters: U+061C ARABIC LETTER MARK and the isolate span controls U+2066..U+2069.
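As an illustration, the five code points named above can be enumerated programmatically. The following is a minimal sketch, assuming a Python interpreter whose `unicodedata` module is built against Unicode 6.3 or later. The names `NEW_BIDI_CONTROLS` and `new_bidi_controls` are hypothetical and are not part of the standard or of any library.

```python
import unicodedata

# The five bidirectional format controls added in Unicode 6.3:
# U+061C plus the isolate span controls U+2066..U+2069.
NEW_BIDI_CONTROLS = [0x061C] + list(range(0x2066, 0x206A))


def new_bidi_controls():
    """Return (code point label, character name) pairs for the five controls.

    The name falls back to "<unknown>" when the interpreter's Unicode
    database predates Version 6.3.
    """
    return [
        ("U+%04X" % cp, unicodedata.name(chr(cp), "<unknown>"))
        for cp in NEW_BIDI_CONTROLS
    ]


for label, name in new_bidi_controls():
    print(label, name)
```

The version of the Unicode Character Database used by a given interpreter can be checked with `unicodedata.unidata_version`.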

See Sections D through H below for additional details regarding the changes in this version of the Unicode Standard, its associated annexes, and the other synchronized Unicode specifications.

B. Version Information

Version 6.3 of the Unicode Standard consists of the core specification (unchanged from Version 6.2), the delta and archival code charts for this version, the Unicode Standard Annexes, and the Unicode Character Database (UCD).

The core specification gives the general principles, requirements for conformance, and guidelines for implementers. The code charts show representative glyphs for all the Unicode characters. The Unicode Standard Annexes supply detailed normative information about particular aspects of the standard. The Unicode Character Database supplies normative and informative data for implementers to allow them to implement the Unicode Standard.

Version 6.3.0 of the Unicode Standard should be referenced as:

The Unicode Consortium. The Unicode Standard, Version 6.3.0, (Mountain View, CA: The Unicode Consortium, 2013. ISBN 978-1-936213-08-5)
http://www.unicode.org/versions/Unicode6.3.0/

A complete specification of the contributory files for Unicode 6.3 is found on the page Components for 6.3.0. That page also provides the recommended reference format for Unicode Standard Annexes.

The navigation bar on the left of this page provides links to the core specification as a single file, to individual chapters, and to the appendices. Also provided are links to the code charts, the radical-stroke indices to CJK ideographs, the Unicode Standard Annexes, and the data files for Version 6.3 of the Unicode Character Database. The core specification (Version 6.2) is also available in a print-on-demand version.

Code Charts

Several sets of code charts are available. They serve different purposes:

  • The latest set of code charts for the Unicode Standard is available online. Those charts are always the most current code charts available, and may be updated at any time. The charts are organized by scripts and blocks for easy reference. An online index by character name is also provided.

For Unicode 6.3.0 in particular, two additional sets of code chart pages are provided:

  • A set of delta code charts showing the blocks in which bidirectional format controls were added for Unicode 6.3.0. Those characters are visually highlighted in the relevant chart. These delta code charts also include blocks which contain significant glyph changes to fix errata.
  • A set of archival code charts that represent the entire set of characters, names and representative glyphs at the time of publication of Unicode 6.3.0.

The delta and archival code charts are a stable part of this release of the Unicode Standard. They will never be updated.

Errata

Errata incorporated into Unicode 6.3 are listed by date in a separate table. For corrigenda and errata after the release of Unicode 6.3, see the list of current Updates and Errata.

C. Stability Policy Update

The statement of the stability policy for the Bidi_Class property was slightly reworded to clarify exactly which types of changes are allowed for it. This update is related to the changes in Unicode 6.3.0 for the Unicode Bidirectional Algorithm. See Property Value Stability.

A new constraint was added to guarantee that characters with the General_Category property value Number also have a Numeric_Type property value distinct from None. See Property Value Stability.
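The constraint can be spot-checked against a character database. The following is a rough sketch in Python, under a stated assumption: the standard `unicodedata` module does not expose Numeric_Type directly, so the sketch treats "no numeric value available" as a stand-in for Numeric_Type=None. The function name `numeric_type_violations` is hypothetical.

```python
import unicodedata


def numeric_type_violations(chars):
    """Return the characters in `chars` whose General_Category is Number
    (Nd, Nl, or No) but for which no numeric value is available."""
    violations = []
    for ch in chars:
        # Nd, Nl, and No are the only General_Category values starting with "N".
        is_number = unicodedata.category(ch).startswith("N")
        # unicodedata.numeric() returns the supplied default when the
        # character has no numeric value (used here as a proxy for
        # Numeric_Type=None).
        if is_number and unicodedata.numeric(ch, None) is None:
            violations.append(ch)
    return violations


# A digit, an Arabic-Indic digit, a Roman numeral, a fraction, and a letter.
sample = "7\u0663\u2167\u00BDA"
print(numeric_type_violations(sample))
```

Passing every code point through the same function would give a fuller check of a particular database version.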

Note: The Unicode Character Encoding Stability Policy restricts possible future changes to the Unicode Standard, but is not formally a part of the standard itself.

D. Textual Changes and Character Additions

The text for the Core Specification is unchanged for this version. Textual changes are limited to the Unicode Standard Annexes.

Character Assignment Overview

Five new character assignments were made in the BMP for the Unicode Standard, Version 6.3. This addition brings the total number of characters assigned in the standard to 110,122. (That is the traditional count, which totals up graphic and format characters, but omits surrogate code points, ISO control codes, noncharacters, and private-use allocations.)
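The traditional count described above can be approximated from General_Category values. The following sketch assumes Python's `unicodedata` module, which reports unassigned code points (including noncharacters) as Cn. The result depends on the Unicode version bundled with the interpreter, so it matches 110,122 only when that version is 6.3. The names `EXCLUDED` and `traditional_character_count` are hypothetical.

```python
import sys
import unicodedata

# General_Category values omitted from the traditional count:
# Cs (surrogates), Cc (ISO control codes), Co (private use),
# Cn (unassigned code points, which include the noncharacters).
EXCLUDED = {"Cs", "Cc", "Co", "Cn"}


def traditional_character_count():
    """Count graphic and format characters known to this interpreter."""
    return sum(
        1
        for cp in range(sys.maxunicode + 1)
        if unicodedata.category(chr(cp)) not in EXCLUDED
    )


print(unicodedata.unidata_version, traditional_character_count())
```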

No new blocks are defined in Version 6.3.

E. Conformance Changes

There are no significant conformance changes in the core specification. However, there are significant changes to the Unicode Bidirectional Algorithm in UAX #9.

F. Changes in the Unicode Character Database

The detailed listing of all changes to the contributory data files of the Unicode Character Database for Version 6.3 can be found in UAX #44, Unicode Character Database.

TBD

G. Changes in the Unicode Standard Annexes

In Version 6.3, many of the Unicode Standard Annexes have had significant revisions. The most important of these changes are listed below. For the full details of all changes, see the Modifications section of each UAX, linked directly from the following list of UAXes.

Unicode Standard Annex Changes
UAX #9
Unicode Bidirectional Algorithm
The Unicode Bidirectional Algorithm was substantially extended to support isolate runs and to resolve paired brackets as a unit. For the former extension, four new Bidi_Class property values were added. For the latter, two normative properties and an algorithm rule N0 were introduced. Additional definitions, rule revisions, notes, and examples were included, and a new test file was added.
UAX #11
East Asian Width
No significant changes in this version.
UAX #14
Unicode Line Breaking Algorithm
The description of the CM class was updated to reflect a refinement in line breaking for U+3035 VERTICAL KANA REPEAT MARK LOWER HALF, and the description of the BA class was updated to reflect a change for U+3000 IDEOGRAPHIC SPACE.
UAX #15
Unicode Normalization Forms
No significant changes in this version.
UAX #24
Unicode Script Property
No significant changes in this version.
UAX #29
Unicode Text Segmentation
Some minor updates were made for word segmentation. Colon was removed from MidLetter, so that it is no longer contained within words. (Swedish word boundary determination can be handled by tailoring.) Apostrophe and double quote are now allowed within a strictly Hebrew word context, to reflect their common use in place of geresh and gershayim.
UAX #31
Unicode Identifier and Pattern Syntax
No significant changes in this version.
UAX #34
Unicode Named Character Sequences
No significant changes in this version.
UAX #38
Unicode Han Database (Unihan)
The status of kCompatibilityVariant was clarified. kHanyuPinlu was changed to use accents instead of numbers for tones, and the regular expression for it was modified accordingly. Many other minor documentation updates were made.
UAX #41
Common References for Unicode Standard Annexes
Minor updates were made to the references.
UAX #42
Unicode Character Database in XML
Changes were made to track additional properties and property values for the Unicode Bidirectional Algorithm.
UAX #44
Unicode Character Database
The status of default values was clarified. Numerous changes were made to reflect changes to the Unicode Bidirectional Algorithm and its associated character properties and data files. A clarification was added about Numeric_Type=Digit.
UAX #45
U-Source Ideographs
283 characters were added to the list of U-Source ideographs. A new status of UNC-2013 was added and documented.
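As an illustration of the isolate support listed under UAX #9 above, text of unknown direction can be wrapped in U+2068 FIRST STRONG ISOLATE and U+2069 POP DIRECTIONAL ISOLATE before it is inserted into a larger string. The following is a minimal sketch in Python; the helper name `isolate` and the sample message are hypothetical, and the sketch only builds the string. Actual reordering is left to whatever bidi implementation renders it.

```python
import unicodedata

FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def isolate(text):
    """Wrap `text` in FSI ... PDI so that a bidi implementation
    treats it as a directional isolate."""
    return FSI + text + PDI


# Embed a Hebrew string (alef, bet, gimel) in a left-to-right sentence.
message = "User " + isolate("\u05D0\u05D1\u05D2") + " logged in."

# With a Unicode 6.3 or later database, these report two of the
# new Bidi_Class property values.
print([unicodedata.bidirectional(c) for c in (FSI, PDI)])
```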

H. Changes in Synchronized Unicode Technical Standards

There are also significant revisions in the Unicode Technical Standards whose versions are synchronized with the Unicode Standard. The most important of these changes are listed below. For the full details of all changes, see the Modifications section of each UTS, linked directly from the following list of UTSes.

Unicode Technical Standard Changes
UTS #10
Unicode Collation Algorithm
The CLDR root collation data files contained in CollationAuxiliary.zip, along with the related documentation, have been moved from the UCA release directory to the CLDR repository. Trailing collation elements are now given regular tertiary weights in DUCET, which allows for full case differences among compatibility characters. The IgnoreSP option for handling variables (intended for ignoring symbols but not punctuation) has been removed. The weights 0xFFFD..0xFFFF are now reserved for special collation elements. In addition, the text of UTS #10 has been reorganized for better flow.
UTS #46
Unicode IDNA Compatibility Processing
The five new bidirectional format controls were added. They have the status of disallowed.
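A pre-check for the disallowed status noted above might look like the following sketch. It is not an implementation of UTS #46 processing. It only reports occurrences of the five new controls (U+061C and U+2066..U+2069) in a label, and the names `DISALLOWED_NEW_CONTROLS` and `find_new_bidi_controls` are hypothetical.

```python
# The five bidirectional format controls added in Unicode 6.3.
DISALLOWED_NEW_CONTROLS = frozenset([0x061C, 0x2066, 0x2067, 0x2068, 0x2069])


def find_new_bidi_controls(label):
    """Return (index, 'U+XXXX') for each of the new controls found in `label`."""
    return [
        (i, "U+%04X" % ord(ch))
        for i, ch in enumerate(label)
        if ord(ch) in DISALLOWED_NEW_CONTROLS
    ]


print(find_new_bidi_controls("exam\u2066ple"))
```

An empty result means only that none of these five characters is present; a full implementation would consult the complete UTS #46 mapping table.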