Unicode 6.3.0 (In preparation)
Unicode 6.3.0 is under preparation as a
minor version of the Unicode Standard and will supersede all previous versions.
This page summarizes the important changes for the Unicode Standard, Version 6.3.0.
In the discussion below, Version 6.3.0 may be abbreviated as "Unicode 6.3" or "Version 6.3."
Note that the Core Specification is unchanged from Version 6.2 for Version 6.3. As a consequence,
links to the chapters of the Core Specification resolve to the same PDF files as for Version 6.2.
A. Summary
B. Version Information
C. Stability Policy Update
D. Textual Changes and Character Additions
E. Conformance Changes
F. Changes in the Unicode Character Database
G. Changes in the Unicode Standard Annexes
H. Changes in Synchronized Unicode Technical Standards
Version 6.3 of the Unicode Standard is a special release focused on the update of the Unicode
Bidirectional Algorithm. This version also rolls in various minor corrections for errata and
other small updates for the Unicode Character Database.
Two other important Unicode specifications are maintained in synchrony with the Unicode Standard, and have updates for
Version 6.3:
- UTS #10, Unicode Collation Algorithm
- UTS #46, Unicode IDNA Compatibility Processing
This version of the Unicode Standard is synchronized with ISO/IEC 10646:2012,
plus the accelerated publication of five bidirectional format control characters: U+061C ARABIC LETTER MARK
and the isolate span controls U+2066..U+2069.
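As a rough illustration of the five characters named above, the following sketch looks them up with Python's standard `unicodedata` module. The helper name `describe` and the constant `NEW_BIDI_CONTROLS` are hypothetical, introduced only for this example. The output depends on the UCD version bundled with the Python interpreter, which is assumed here to be 6.3.0 or later.

```python
import unicodedata

# The five bidirectional format controls added in Unicode 6.3:
# U+061C plus the isolate span controls U+2066..U+2069.
NEW_BIDI_CONTROLS = [0x061C] + list(range(0x2066, 0x206A))


def describe(code_point):
    """Return (U+XXXX label, character name, Bidi_Class, General_Category)."""
    ch = chr(code_point)
    return (
        "U+%04X" % code_point,
        # Fall back to a marker string if the bundled UCD predates the character.
        unicodedata.name(ch, "<unnamed in this UCD version>"),
        unicodedata.bidirectional(ch),
        unicodedata.category(ch),
    )


if __name__ == "__main__":
    print("UCD version bundled with this Python:", unicodedata.unidata_version)
    for cp in NEW_BIDI_CONTROLS:
        print(*describe(cp))
```

On an interpreter whose bundled UCD is older than 6.3.0, these code points would be reported as unassigned instead.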
See Sections D through H below for additional details regarding the changes in this version of
the Unicode Standard, its associated annexes, and the other synchronized Unicode specifications.
Version 6.3 of the Unicode Standard consists of the core specification (unchanged from
Version 6.2),
the delta and archival code charts for this version, the Unicode Standard Annexes, and
the Unicode Character Database (UCD).
The core specification gives the general principles,
requirements for conformance, and guidelines for implementers. The
code charts show representative glyphs for all the Unicode
characters. The Unicode Standard Annexes supply detailed normative
information about particular aspects of the standard. The Unicode
Character Database supplies normative and informative data for
implementers to allow them to implement the Unicode Standard.
Version 6.3.0 of the Unicode Standard
should be referenced as:
The Unicode Consortium. The Unicode Standard, Version 6.3.0, (Mountain View, CA: The Unicode Consortium,
2013. ISBN 978-1-936213-08-5)
http://www.unicode.org/versions/Unicode6.3.0/
A complete specification of the contributory files for Unicode
6.3 is found on the page
Components for 6.3.0.
That page also provides the recommended reference format for Unicode Standard Annexes.
The navigation bar on the left of this page provides links to
the core specification as a single file,
to its individual chapters, and
to the appendices.
Also provided are links to the
code charts, the
radical-stroke indices to CJK
ideographs, the Unicode Standard Annexes,
and the data files for Version 6.3 of the Unicode Character Database.
The core specification (Version 6.2) is also available in a
print-on-demand version.
Several sets of code charts are available. They serve different
purposes:
- The latest set of code charts for the Unicode Standard is available online. Those charts are always the most current code charts available, and may be updated at any time. The charts are organized by scripts and blocks for easy reference. An online index by character name is also provided.
For Unicode 6.3.0 in particular, two additional sets of code chart pages are provided:
- A set of delta code charts showing the
blocks in which bidirectional format controls were added for Unicode 6.3.0. Those characters are visually highlighted in the relevant chart.
These delta code charts also include blocks which contain significant glyph changes to fix errata.
- A set of archival code charts that represent
the entire set of characters, names and representative glyphs at the time of publication of Unicode 6.3.0.
The delta and archival code charts are a stable part of this release of the Unicode Standard. They will never be updated.
Errata incorporated into Unicode 6.3 are listed by date in
a separate table. For corrigenda and errata after the release of Unicode 6.3, see the list of current
Updates and Errata.
The statement of the stability policy for the Bidi_Class property was slightly reworded
to clarify the exact type of changes allowed for it. This update is related to the
changes in Unicode 6.3.0 for the Unicode Bidirectional Algorithm. See
Property Value Stability.
A new constraint was added to guarantee that characters with the General_Category
property value Number also have a Numeric_Type property value distinct from None. See
Property Value Stability.
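The new constraint can be spot-checked against a given copy of the character data. The sketch below is one possible approach using Python's standard `unicodedata` module; the function name `numbers_without_value` is hypothetical. Note one assumption: `unicodedata` does not expose Numeric_Type directly, so the check uses the presence of a numeric value as a proxy, since a character with a numeric value has a Numeric_Type other than None.

```python
import sys
import unicodedata


def numbers_without_value(code_points):
    """Return code points whose General_Category is a Number category
    (Nd, Nl, or No) but for which no numeric value is available."""
    missing = []
    for cp in code_points:
        ch = chr(cp)
        is_number = unicodedata.category(ch).startswith("N")
        # numeric() returns the supplied default when no value is defined.
        if is_number and unicodedata.numeric(ch, None) is None:
            missing.append(cp)
    return missing


if __name__ == "__main__":
    missing = numbers_without_value(range(sys.maxunicode + 1))
    print("UCD version:", unicodedata.unidata_version)
    print("Number characters lacking a numeric value:", len(missing))
```

If the bundled data honors the constraint, the reported count should be zero.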
Note: The Unicode Character Encoding Stability Policy restricts possible future changes to the Unicode Standard, but is not formally a part of the standard itself.
The text for the Core Specification is unchanged for this version. Textual changes are limited to the Unicode
Standard Annexes.
Character Assignment Overview
Five new character assignments were made in the BMP for the Unicode Standard, Version 6.3.
This addition brings the total number of characters assigned in the standard to 110,122.
(That is the traditional count, which totals up graphic and format characters, but
omits surrogate code points, ISO control codes, noncharacters, and private-use allocations.)
No new blocks are defined in Version 6.3.
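The traditional count described above can be approximated from General_Category values alone. The following is a minimal sketch, not the method used to produce the published figure: it assumes that excluding the categories Cn, Cs, Cc, and Co matches the exclusions listed in the text, and the names `EXCLUDED`, `is_counted`, and `traditional_count` are hypothetical. The result reflects whatever UCD version is bundled with the Python interpreter, so it matches the Version 6.3 total only when that version is 6.3.0.

```python
import sys
import unicodedata

# General_Category values left out of the traditional count:
#   Cn = unassigned code points (this includes noncharacters)
#   Cs = surrogate code points
#   Cc = ISO control codes
#   Co = private-use allocations
EXCLUDED = {"Cn", "Cs", "Cc", "Co"}


def is_counted(code_point):
    """True if the code point is an assigned graphic or format character."""
    return unicodedata.category(chr(code_point)) not in EXCLUDED


def traditional_count():
    """Count assigned graphic and format characters across all code points."""
    return sum(1 for cp in range(sys.maxunicode + 1) if is_counted(cp))


if __name__ == "__main__":
    print("UCD version:", unicodedata.unidata_version)
    print("Traditional character count:", traditional_count())
```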
There are no significant conformance changes in the core specification.
However, there are significant changes to the Unicode Bidirectional Algorithm in
UAX #9.
The detailed listing of all changes to the contributory data files of the Unicode Character Database
for Version 6.3 can be found in
UAX #44, Unicode Character Database.
TBD
In Version 6.3, many of the Unicode Standard Annexes have had significant revisions. The most important of these changes are listed below. For the full details of all changes, see the Modifications section
of each UAX, linked directly from the following list of UAXes.
| Unicode Standard Annex | Changes |
| --- | --- |
| UAX #9 Unicode Bidirectional Algorithm | The Unicode Bidirectional Algorithm was substantially extended to support isolate runs and to resolve paired brackets as a unit. For the former extension, four new Bidi_Class property values were added. For the latter, two normative properties and an algorithm rule N0 were introduced. Additional definitions, rule revisions, notes, and examples were included, and a new test file was added. |
| UAX #11 East Asian Width | No significant changes in this version. |
| UAX #14 Unicode Line Breaking Algorithm | The description of the CM class was updated to reflect a refinement in line breaking for U+3035 VERTICAL KANA REPEAT MARK LOWER HALF, and the description of the BA class was updated to reflect a change for U+3000 IDEOGRAPHIC SPACE. |
| UAX #15 Unicode Normalization Forms | No significant changes in this version. |
| UAX #24 Unicode Script Property | No significant changes in this version. |
| UAX #29 Unicode Text Segmentation | There were some minor updates made for word segmentation. Colon was removed from MidLetter, so that it is no longer contained within words. (Swedish word boundary determination can be handled by tailoring.) Apostrophe and double quote are now allowed within a strictly Hebrew word context, to reflect their common use in place of geresh and gershayim. |
| UAX #31 Unicode Identifier and Pattern Syntax | No significant changes in this version. |
| UAX #34 Unicode Named Character Sequences | No significant changes in this version. |
| UAX #38 Unicode Han Database (Unihan) | The status of kCompatibilityVariant was clarified. kHanyuPinlu was changed to use accents instead of numbers for tones, and the regular expression for it was modified accordingly. Many other minor documentation updates were made. |
| UAX #41 Common References for Unicode Standard Annexes | Minor updates were made to the references. |
| UAX #42 Unicode Character Database in XML | Changes were made to track additional properties and property values for the Unicode Bidirectional Algorithm. |
| UAX #44 Unicode Character Database | The status of default values was clarified. Numerous changes were made to reflect changes to the Unicode Bidirectional Algorithm and its associated character properties and data files. A clarification was added about Numeric_Type=Digit. |
| UAX #45 U-Source Ideographs | 283 characters were added to the list of U-Source ideographs. A new status of UNC-2013 was added and documented. |
There are also significant revisions in the Unicode Technical Standards whose
versions are synchronized with the Unicode Standard. The most important of these changes are listed below.
For the full details of all changes, see the Modifications section
of each UTS, linked directly from the following list of UTSes.
| Unicode Technical Standard | Changes |
| --- | --- |
| UTS #10 Unicode Collation Algorithm | The CLDR root collation data files contained in CollationAuxiliary.zip, along with the related documentation, have been moved from the UCA release directory to the CLDR repository. [link needed] Trailing collation elements are now given regular tertiary weights in DUCET, which allows for full case differences among compatibility characters. The IgnoreSP option for handling variables (intended for ignoring symbols but not punctuation) has been removed. The weights 0xFFFD..0xFFFF are now reserved for special collation elements. In addition, the text of UTS #10 has been reorganized for better flow. |
| UTS #46 Unicode IDNA Compatibility Processing | The five new bidirectional format controls were added. They have the status of disallowed. |