Unicode Technical Reports cover a wide range of
topics related to the implementation or development of the
Unicode Standard.
These include but are not limited to:
- normalizing Unicode text for comparison and storage
- compressing Unicode text, for storage comparable to that
of legacy encodings
- collating (sorting) strings
- linebreaking text
- performing folding operations
- designing regular expressions
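To make two of the topics above concrete, the following is a minimal Python sketch of normalization for comparison and of case folding. It assumes only the standard library's unicodedata module. The helper names normalized_equal and caseless_equal are hypothetical and are not taken from any technical report.

```python
import unicodedata


def normalized_equal(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after applying the same normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings after case folding, normalizing before and after."""
    def fold(s: str) -> str:
        # Normalize, fold case, then normalize again, because folding
        # can produce text that is no longer in normalized form.
        return unicodedata.normalize("NFD", unicodedata.normalize("NFD", s).casefold())
    return fold(a) == fold(b)


composed = "\u00e9"       # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"    # "e" followed by U+0301 COMBINING ACUTE ACCENT
```

The two sample strings render identically but differ code point by code point, so a plain equality test treats them as different, while normalized_equal(composed, decomposed) treats them as the same text.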
These reports are normatively referenced by a
number of international standards and by a wide range of
products. For more information on other Unicode Specifications, see the
Specifications FAQ.
Types of Unicode Technical Reports: UTR, UTS, UAX
There are three types of technical reports,
based on the authoritative status of the document:
A Unicode Standard Annex (UAX) forms an integral part
of the Unicode Standard, but is published online as a separate
document. The Unicode Standard may require conformance to
normative content in a Unicode Standard Annex, if so specified
in the Conformance chapter of that version of the Unicode
Standard. The version number of a UAX document corresponds to
the version of the Unicode Standard of which it forms a part.
A Unicode Technical Standard (UTS) is an independent
specification. Conformance to the Unicode Standard does not
imply conformance to any UTS.
A Unicode Technical Report (UTR) contains informative
material. Conformance to the Unicode Standard does not imply
conformance to any UTR. Other specifications, however, are free
to make normative references to a UTR.
As technical reports, including UAXs and UTSs, are developed, the Unicode Technical
Committee approves the posting of preliminary versions or
proposed updates for
public review. Publication of these draft versions does not
imply endorsement by the Unicode Consortium.
A Draft Unicode Technical Report (DUTR)
has the basic structure and content required, but still is not
ready for final approval.
A Proposed Draft Unicode Technical Report (PDUTR)
is in the earliest stages of development.
A Proposed Update of a UTR, UAX or UTS
contains the draft of a proposed modification of an approved
UTR, UAX or UTS.
Any technical report that is a Draft, Proposed Draft,
or Proposed Update is a preliminary document that may be updated,
replaced, or superseded by other documents at any time. Such
documents are not stable specifications; it is inappropriate to cite them
as other than works in progress.
Earlier approved technical reports may be
incorporated into The Unicode Standard or superseded by newer
Unicode technical reports. These are marked as Superseded.
However, archival copies of all approved versions of all
technical reports are either maintained on the Unicode web site
or available from the Unicode Consortium in paper copy upon
request.
Certain technical reports were withdrawn before
ever being approved. Such withdrawn technical reports are not
available. To prevent confusion, their numbers are not reused.
Technical reports are created by the Unicode Consortium Technical
Committees (UTC and
CLDR-TC) in an open,
consensus-oriented process.
For more information about the approval process,
see the FAQ on the Technical
Reports Development Process. When
appropriate, the Public
Review Issues page solicits review and feedback on updates
to technical reports.
One UTS, the
Unicode Collation Algorithm, has
an additional set of policies governing the maintenance
of the basic data table used in assigning collation weights
to characters. See
Change Management for the Unicode
Collation Algorithm and
UCA Default Table
Criteria for New Characters.
Uniform and persistent revision numbers are used
for all technical reports, even when a UTR changes its status to a UAX or a UTS.
This revision number is incremented and a new URL is provided
each time the file is altered materially. Modifications to the
report are summarized in a change history. Certain minor
structural corrections of the HTML source, for example to fix
broken links, may be made without a new revision number. In that
case the date in the report header will be updated.
Each file has links to the
previous approved version of the report and to the latest
approved version of the report, allowing readers to find and
cite particular versions.
- For a UAX, the version number is of the form “Unicode
4.0.1” to reflect the fact that it
is part of a given version of the Unicode Standard.
- Some UTSs use fractional version numbers to distinguish
minor updates of the documents from major changes in the
specification. Where a version of a UTS is synchronized to a
version of the Unicode Standard, its version number
corresponds to the version number of the standard, with an
optional fourth version field for intermediate subversions
of the UTS.
- For all other UTRs and UTSs, the version number of the
report is identical to the revision number.
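The version schemes described in the list above can be sketched in Python as follows. This is an illustration only: the helpers parse_version, standard_version_of, and is_synchronized are hypothetical names, and padding missing fields with zero is an assumption rather than a documented rule. The synchronization check is meaningful only for a UTS whose version tracks the Unicode Standard.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Split a dotted version such as "4.0.1" or "4.0.1.2" into integers.

    A whole-number revision such as "9" yields a one-field tuple.
    """
    fields = text.strip().split(".")
    if not 1 <= len(fields) <= 4:
        raise ValueError(f"expected 1 to 4 numeric fields, got {text!r}")
    return tuple(int(field) for field in fields)


def standard_version_of(version: str) -> Tuple[int, int, int]:
    """Return the major.minor.update part, dropping any fourth field.

    Assumption: missing fields are padded with zero.
    """
    fields = parse_version(version)[:3]
    padded = fields + (0,) * (3 - len(fields))
    return (padded[0], padded[1], padded[2])


def is_synchronized(uts_version: str, standard_version: str) -> bool:
    """Check whether a UTS version matches a version of the standard."""
    return standard_version_of(uts_version) == standard_version_of(standard_version)
```

Under these assumptions, is_synchronized("4.0.1.2", "4.0.1") holds because the optional fourth field is ignored. Comparing parsed tuples also orders versions numerically, so 10.0.0 sorts after 9.0.0, which a plain string comparison would get wrong.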
Because revision numbers use whole
numbers, rather than a major.minor.update version syntax,
some UTRs have rather large version numbers. Many
of the revision number changes, however, reflect
rather minor editorial changes to the documents, as
opposed to substantive changes to their contents.
UTSs containing data files that depend on the
repertoire of the Unicode Standard may be revised for each
repertoire change of the standard (minor version). Unless
necessary, they are not revised for update versions of the
Unicode Standard.
For more information about citing versions of technical
reports, see Versions
of the Unicode Standard.
Data files for UAXs are maintained in the
Unicode Character Database (UCD).
These data files are versioned with the same three-level version
number as the Unicode Standard. Data files for UTSs are
maintained in separate, versioned folders under
http://www.unicode.org/Public/. The location of each set of
data files is documented in the corresponding UTS. Each folder
contains a complete set of data files for that version of the
UTS.
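As a small illustration of the three-level UCD version number, the sketch below reads the UCD version that a Python runtime was built against. It assumes the standard library's unicodedata.unidata_version attribute. The helper ucd_version_fields is a hypothetical name introduced here.

```python
import unicodedata


def ucd_version_fields(version: str = unicodedata.unidata_version) -> tuple:
    """Split a dotted UCD version string into integer fields."""
    return tuple(int(field) for field in version.split("."))


# The fields of the UCD version bundled with this Python runtime.
runtime_ucd_version = ucd_version_fields()
```

The reported value depends on the Python release in use, so it describes the runtime's copy of the data rather than the latest published version.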
Reports of errors in
published documents, such as the Unicode Standard itself or technical
reports, may be filed using the Unicode Consortium's online form. If a reported
error is confirmed, an erratum may be posted, depending on the nature of the
error, and the error fixed in subsequent editions of the Standard.