 

About Unicode Technical Reports

Unicode Technical Reports cover a wide range of topics related to the implementation or development of the Unicode Standard.

These include, but are not limited to:

  • normalizing Unicode text for comparison and storage
  • compressing Unicode text, for storage comparable to that of legacy encodings
  • collating (sorting) strings
  • linebreaking text
  • designing regular expressions

These reports are normatively referenced by a number of international standards and by a wide range of products. For more information on other Unicode Specifications, see the Specifications FAQ.

Types of Unicode Technical Reports: UTR, UTS, UAX

There are three types of technical reports, based on the authoritative status of the document:

A Unicode Standard Annex (UAX) forms an integral part of the Unicode Standard, but is published online as a separate document. The Unicode Standard may require conformance to normative content in a Unicode Standard Annex, if so specified in the Conformance chapter of that version of the Unicode Standard. The version number of a UAX document corresponds to the version of the Unicode Standard of which it forms a part.

A Unicode Technical Standard (UTS) is an independent specification. Conformance to the Unicode Standard does not imply conformance to any UTS.

A Unicode Technical Report (UTR) contains informative material. Conformance to the Unicode Standard does not imply conformance to any UTR. Other specifications, however, are free to make normative references to a UTR.

As technical reports, including UAXs and UTSs, are developed, the Unicode Technical Committee approves the posting of preliminary versions or proposed updates for public review. Publication of these draft versions does not imply endorsement by the Unicode Consortium.

A Draft Unicode Technical Report (DUTR) has the required basic structure and content, but is not yet ready for final approval.

A Proposed Draft Unicode Technical Report (PDUTR) is in the earliest stages of development.

A Proposed Update of a UTR, UAX or UTS contains the draft of a proposed modification of an approved UTR, UAX or UTS.

Any technical report that is Draft, Proposed Draft, or Proposed Update is a preliminary document which may be updated, replaced, or superseded by other documents at any time. Such documents are not stable specifications; it is inappropriate to cite them as other than works in progress.

Development Process

Technical reports are created by the Unicode Consortium Technical Committees (UTC and CLDR-TC) in an open, consensus-oriented process. For more information about the approval process, see the FAQ on the Technical Reports Development Process. When appropriate, the Public Review Issues page solicits review and feedback on updates to technical reports.

One UTS, the Unicode Collation Algorithm, has an additional set of policies governing the maintenance of the basic data table used in assigning collation weights to characters. See Change Management for the Unicode Collation Algorithm and UCA Default Table Criteria for New Characters.

Versioning

Uniform and persistent revision numbers are used for all technical reports, even when a UTR changes its status to a UAX or a UTS. The revision number is incremented and a new URL is provided each time the file is materially altered. Modifications to the report are summarized in a change history. Certain minor structural corrections to the HTML source, for example to fix broken links, may be made without a new revision number. In that case, the date in the report header is updated.

Each file has links to the previous approved version of the report and to the latest approved version of the report, allowing readers to find and cite particular versions.

  • For a UAX, the version number is of the form Unicode 4.0.1 to reflect the fact that it is part of a given version of the Unicode Standard.
  • Some UTSs use fractional version numbers to distinguish minor updates of the documents from major changes in the specification. Where a version of a UTS is synchronized to a version of the Unicode Standard, its version number corresponds to the version number of the standard, with an optional fourth version field for intermediate subversions of the UTS.
  • For all other UTRs and UTSs, the version number of the report is identical to the revision number.
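The three forms of version number listed above can be told apart mechanically. The following Python fragment is a minimal sketch, not an official tool: the function name, the class name, and the sample labels other than "Unicode 4.0.1" are assumptions made for illustration.

```python
import re
from typing import NamedTuple, Tuple


class ReportVersion(NamedTuple):
    """Parsed version label of a technical report (illustrative only)."""
    kind: str                 # "unicode", "dotted", or "revision"
    fields: Tuple[int, ...]   # numeric fields, most significant first


def parse_report_version(label: str) -> ReportVersion:
    """Classify a version label according to the three forms described above."""
    text = label.strip()

    # UAX form: tied to a version of the Unicode Standard, e.g. "Unicode 4.0.1".
    match = re.fullmatch(r"Unicode (\d+)\.(\d+)\.(\d+)", text)
    if match:
        return ReportVersion("unicode", tuple(int(g) for g in match.groups()))

    # UTS form: fractional numbers, or a Unicode-style number with an
    # optional fourth field for intermediate subversions.
    if re.fullmatch(r"\d+(?:\.\d+){1,3}", text):
        return ReportVersion("dotted", tuple(int(p) for p in text.split(".")))

    # All other reports: the version is simply the whole-number revision.
    if re.fullmatch(r"\d+", text):
        return ReportVersion("revision", (int(text),))

    raise ValueError(f"unrecognized version label: {label!r}")
```

Because the numeric fields are stored as a tuple, two labels of the same kind can be ordered with an ordinary tuple comparison.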

Because revision numbers are whole numbers, rather than following a major.minor.update version syntax, some UTRs have large version numbers. Many of the revision number changes, however, reflect minor editorial changes to the documents rather than substantive changes to their contents.

UTSs containing data files that depend on the repertoire of the Unicode Standard may be revised for each repertoire change of the standard (minor version). Unless necessary, they are not revised for update versions of the Unicode Standard.
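The revision policy for data files can be sketched as a small check on the version fields of the Unicode Standard. This is only an illustration: the function names are hypothetical, and treating a major version change the same way as a minor one is an assumption, since the text above names only minor versions explicitly.

```python
from typing import Tuple


def parse_unicode_version(text: str) -> Tuple[int, int, int]:
    """Split a 'major.minor.update' string such as '4.0.1' into integers."""
    parts = tuple(int(p) for p in text.split("."))
    if len(parts) != 3:
        raise ValueError(f"expected major.minor.update, got {text!r}")
    return parts


def change_level(old: str, new: str) -> str:
    """Return the most significant field that differs between two versions."""
    o, n = parse_unicode_version(old), parse_unicode_version(new)
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    if n[2] != o[2]:
        return "update"
    return "none"


def data_revision_expected(old: str, new: str) -> bool:
    """True when the data files of a UTS would normally be revised.

    Assumption: major versions are treated like minor versions here.
    Update versions do not trigger a revision unless one is necessary,
    which this sketch does not try to model.
    """
    return change_level(old, new) in ("major", "minor")
```

For instance, moving from 4.0.0 to 4.0.1 is an update version and would not normally call for revised data files, whereas moving from 4.0.1 to 4.1.0 is a minor version and may.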

For more information about citing versions of technical reports, see Versions of the Unicode Standard.

Stable References to Sections of Technical Reports

The UAXs, UTSs, and some of the UTRs have stable HTML anchors defined for section headers. These enable direct links to those sections. In more recent versions of technical reports, stable HTML anchors are also provided for tables, figures, each formal rule or definition, and the modification history of the document.
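A direct link to a section is formed by appending the anchor name to the URL of the report as a fragment. The Python sketch below shows one way to compose such a link. The helper name, the placeholder host example.org, and the anchor names are all hypothetical; the actual anchor names must be taken from the report being cited.

```python
from urllib.parse import quote, urldefrag


def section_link(report_url: str, anchor: str) -> str:
    """Return report_url with its fragment replaced by the given anchor."""
    # Drop any fragment already present so anchors are never doubled up.
    base, _old_fragment = urldefrag(report_url)
    # Percent-encode the anchor in case it contains reserved characters.
    return f"{base}#{quote(anchor, safe='')}"


# Placeholder URL and anchor name, for illustration only.
link = section_link("https://example.org/reports/tr0/#Old_Anchor", "Some_Section")
```

Replacing an existing fragment, rather than appending to it, keeps the result a valid single-fragment URL even when the input was itself copied from a section link.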

Superseded or Withdrawn Reports

Occasionally, the material of a report is incorporated into another document. For example, UAX #13, Newline Guidelines, became Section 5.8, Newline Guidelines, in the Core Specification of the Unicode Standard as of Version 4.0. Such reports are considered superseded and are listed in their own section on the Technical Reports page.

Instead of a link to the latest approved version, the reports page has a link to a page explaining the change in status. Where applicable, that page also provides information on where the material was incorporated. A similar page is provided for reports that have been formally withdrawn. To prevent confusion about the numbering of technical reports, the numbers of superseded or withdrawn reports are never reused.

Old Versions of Unicode

For several early versions of Unicode, the text of the specification was initially published as a Unicode Technical Report or even as a Unicode Standard Annex. Such reports are listed in their own section on the Technical Reports page. Instead of a link to the latest approved version, the reports page has a link to a page providing a summary for that version of the Unicode Standard.

Data Files

Data files for UAXs are maintained in the Unicode Character Database (UCD). These data files are versioned with the same three-level version number as the Unicode Standard.

Data files for UTSs or UTRs are maintained in separate, versioned folders under http://www.unicode.org/Public/. The location of each set of data files is documented in the corresponding UTS or UTR. Each folder contains a complete set of data files for that version of the document.

Errata

On occasion, errata to technical reports and other specifications are posted on the Updates and Errata page. To report errors in published documents, such as the Unicode Standard itself or technical reports, you may use the Unicode Consortium's contact form.

