Unicode Technical Reports cover a wide range of
topics related to the implementation or development of the
Unicode Standard.
These include but are not limited to:
- normalizing Unicode text for comparison and storage
- compressing Unicode text to a storage size comparable to that
of legacy encodings
- collating (sorting) strings
- linebreaking text
- designing regular expressions
These reports are normatively referenced by a
number of international standards and by a wide range of
products. For more information on other Unicode Specifications, see the
Specifications FAQ.
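As a small illustration of the first topic in the list above, the following minimal sketch compares two strings after normalizing them. It uses Python's standard `unicodedata` module; the helper name `same_text` and the choice of NFC as the default form are assumptions made for this example, not something the reports prescribe.

```python
import unicodedata


def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after applying the same normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


composed = "\u00e9"     # "é" as a single code point
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT

print(composed == decomposed)           # False: the code point sequences differ
print(same_text(composed, decomposed))  # True: equal once both are normalized
```

Normalizing before comparison or storage means that canonically equivalent sequences are treated as the same text.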
Types of Unicode Technical Reports: UTR, UTS, UAX
There are three types of technical reports,
based on the authoritative status of the document:
A Unicode Standard Annex (UAX) forms an integral part
of the Unicode Standard, but is published online as a separate
document. The Unicode Standard may require conformance to
normative content in a Unicode Standard Annex, if so specified
in the Conformance chapter of that version of the Unicode
Standard. The version number of a UAX document corresponds to
the version of the Unicode Standard of which it forms a part.
A Unicode Technical Standard (UTS) is an independent
specification. Conformance to the Unicode Standard does not
imply conformance to any UTS.
A Unicode Technical Report (UTR) contains informative
material. Conformance to the Unicode Standard does not imply
conformance to any UTR. Other specifications, however, are free
to make normative references to a UTR.
As technical reports, including UAXs and UTSs, are developed, the Unicode Technical
Committee approves the posting of preliminary versions or
proposed updates for
public review. Publication of these draft versions does not
imply endorsement by the Unicode Consortium.
A Draft Unicode Technical Report (DUTR)
has the basic structure and content required, but is not yet
ready for final approval.
A Proposed Draft Unicode Technical Report (PDUTR)
is in the earliest stages of development.
A Proposed Update of a UTR, UAX or UTS
contains the draft of a proposed modification of an approved
UTR, UAX or UTS.
Any technical report that is
Draft, Proposed Draft,
or Proposed Update is a preliminary document which may be updated,
replaced, or superseded by other documents at any time. Such
documents are not stable specifications; it is inappropriate to cite them
as other than works in progress.
Technical reports are created by the Unicode Consortium Technical
Committees (UTC and CLDR-TC) in an open,
consensus-oriented process.
For more information about the approval process,
see the FAQ on the Technical
Reports Development Process. When
appropriate, the Public
Review Issues page solicits review and feedback on updates
to technical reports.
One UTS, the
Unicode Collation Algorithm, has
an additional set of policies governing the maintenance
of the basic data table used in assigning collation weights
to characters. See
Change Management for the Unicode
Collation Algorithm and
UCA Default Table
Criteria for New Characters.
Uniform and persistent revision numbers are used
for all technical reports, even when a UTR changes its status to a UAX or a UTS.
This revision number is incremented and a new URL is provided
each time the file is altered materially. Modifications to the
report are summarized in a change history. Certain minor
structural corrections of the HTML source, for example to fix
broken links, may be made without a new revision number. In that
case the date in the report header will be updated.
Each file has links to the
previous approved version of the report and to the latest
approved version of the report, allowing readers to find and
cite particular versions.
- For a UAX, the version number is of the form “Unicode
4.0.1” to reflect the fact that it
is part of a given version of the Unicode Standard.
- Some UTSs use fractional version numbers to distinguish
minor updates of the documents from major changes in the
specification. Where a version of a UTS is synchronized to a
version of the Unicode Standard, its version number
corresponds to the version number of the standard, with an
optional fourth version field for intermediate subversions
of the UTS.
- For all other UTRs and UTSs, the version number of the
report is identical to the revision number.
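The three rules above can be sketched as a small function. This is only an illustration under stated assumptions: the `Report` class, its field names, and `version_label` are hypothetical, and the sketch does not model the fractional version numbers that some UTSs use.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Report:
    """Hypothetical record for a technical report (illustrative only)."""
    kind: str                                          # "UAX", "UTS", or "UTR"
    revision: int                                      # persistent revision number
    unicode_version: Optional[Tuple[int, ...]] = None  # set when synchronized


def version_label(report: Report) -> str:
    """Return a version label following the three rules listed above."""
    if report.kind == "UAX":
        # A UAX is part of a specific version of the Unicode Standard.
        if report.unicode_version is None:
            raise ValueError("a UAX needs the Unicode version it belongs to")
        return "Unicode " + ".".join(map(str, report.unicode_version))
    if report.kind == "UTS" and report.unicode_version is not None:
        # A synchronized UTS follows the standard's version number, with an
        # optional fourth field for intermediate subversions.
        return ".".join(map(str, report.unicode_version))
    # All other UTRs and UTSs: the version equals the revision number.
    return str(report.revision)


print(version_label(Report("UAX", 15, (4, 0, 1))))
print(version_label(Report("UTS", 20, (5, 1, 0, 1))))
print(version_label(Report("UTR", 23)))
```

The revision numbers and versions passed in are made-up sample values, chosen only to exercise each branch.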
Because revision numbers are whole
numbers, rather than a major.minor.update version syntax,
some UTRs have large version numbers. Many
of the revision number changes, however, reflect
minor editorial changes to the documents rather than
substantive changes to their contents.
UTSs containing data files that depend on the
repertoire of the Unicode Standard may be revised for each
repertoire change of the standard (minor version). Unless
necessary, they are not revised for update versions of the
Unicode Standard.
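The policy above can be expressed as a simple check on two versions of the Unicode Standard. The sketch below is illustrative: the function name `repertoire_may_have_changed` is hypothetical, and it assumes that any change in the major or minor field counts as a possible repertoire change, while a change in only the update field does not.

```python
from typing import Tuple

Version = Tuple[int, int, int]  # (major, minor, update)


def repertoire_may_have_changed(old: Version, new: Version) -> bool:
    """Return True if the major or minor field differs between versions.

    Assumption for this sketch: update versions (third field) do not
    change the repertoire, so they do not by themselves call for a
    revision of repertoire-dependent UTS data files.
    """
    return old[:2] != new[:2]


print(repertoire_may_have_changed((4, 0, 0), (4, 1, 0)))  # True
print(repertoire_may_have_changed((4, 0, 0), (4, 0, 1)))  # False
```

A result of `False` only means that no revision is expected by default; as noted above, a UTS may still be revised for an update version when necessary.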
For more information about citing versions of technical
reports, see Versions
of the Unicode Standard.
The UAXs, UTSs, and some of the UTRs have stable HTML anchors defined for section headers. These enable direct links to those sections. In the more recent versions of technical reports, stable HTML anchors are also provided for tables, figures, each formal rule or definition, and the modification history of the document.
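A minimal sketch of how a direct link to an anchored section could be formed, using Python's standard `urllib.parse` module. The helper name `section_link` is hypothetical, and the report URL and anchor name in the usage line are illustrative assumptions; the actual anchor names are defined in each report.

```python
from urllib.parse import urlsplit


def section_link(report_url: str, anchor: str) -> str:
    """Return report_url with its fragment replaced by the given anchor."""
    return urlsplit(report_url)._replace(fragment=anchor).geturl()


# Assumed report URL and anchor name, for illustration only.
print(section_link("https://www.unicode.org/reports/tr29/", "Word_Boundaries"))
```

Because the anchors are stable, a link built this way is intended to keep pointing at the same section across revisions of the report.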
Occasionally, the material of a report is incorporated into another document. For
example, UAX #13, Newline Guidelines, became Section 5.8, Newline Guidelines,
in the Core Specification of the Unicode Standard as of Version 4.0.
Such reports are considered superseded and are listed in their own
section on
the Technical Reports page.
Instead of a link to
the latest approved version, the reports page
has a link to a page explaining the change in status.
Where applicable, that page also provides information on where the material was incorporated.
A similar page is provided for reports
which have been formally withdrawn. To prevent confusion about numbering of technical reports, the numbers of superseded or withdrawn reports are never reused.
Old Versions of Unicode
For several early versions of Unicode, the text of the
specification was published initially as a Unicode Technical
Report or even as a Unicode Standard Annex. Such reports are listed in their own
section on
the Technical Reports page. Instead of a link to the latest approved
version, the reports page has a link to a page providing a summary for
that version of the Unicode Standard.
Data files for UAXs are maintained in the
Unicode Character Database (UCD).
These data files are versioned with the same three-level version
number as the Unicode Standard.
Data files for UTSs or UTRs are
maintained in separate, versioned folders under
http://www.unicode.org/Public/. The location of each set of
data files is documented in the corresponding UTS or UTR. Each folder
contains a complete set of data files for that version of the
document.
On occasion, errata to technical reports and other specifications are
posted on the Updates and Errata page. To report errors in
published documents, such as the Unicode Standard itself or technical
reports, you may use the Unicode Consortium's contact form.