Components of The Latest Version of The Unicode Standard

The following lists the components of the latest version of the Unicode Standard. The version numbering, symbols, and the role of each component are explained in Versions of The Unicode Standard.

Note: All files available via HTTP are mirrored on FTP, so either http://www.unicode.org/Public/ or ftp://www.unicode.org/Public/ can be used.
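
As a rough illustration of the note above, the following Python sketch derives both locations for a file from its path below Public/. The helper name mirror_urls and the example path are assumptions made for illustration only; the two base URLs are the ones quoted in the note.

```python
# Base locations quoted in the note above.
HTTP_BASE = "http://www.unicode.org/Public/"
FTP_BASE = "ftp://www.unicode.org/Public/"


def mirror_urls(relative_path):
    """Return the (HTTP, FTP) URL pair for a path below Public/.

    Hypothetical helper: it only builds the strings and does not
    check that the file exists on either server.
    """
    # Strip a leading slash so the path joins cleanly onto the base.
    path = relative_path.lstrip("/")
    return HTTP_BASE + path, FTP_BASE + path


if __name__ == "__main__":
    http_url, ftp_url = mirror_urls("UNIDATA/UnicodeData.txt")
    print(http_url)
    print(ftp_url)
```

Keeping the two bases as constants means a caller can try one location and fall back to the other without repeating the path.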

The following is the reference format for the latest version of the Unicode Standard.

The Unicode Consortium. The Unicode Standard, Version 5.1.0, defined by: The Unicode Standard, Version 5.0 (Boston, MA, Addison-Wesley, 2007. ISBN 0-321-48091-0) (http://www.unicode.org/versions/Unicode5.0.0/), as amended by Unicode 5.1.0 (http://www.unicode.org/versions/Unicode5.1.0/)

The following is a sample reference format for a UAX. For more examples, see the References section of the Versions page.

Unicode Standard Annex #15, "Unicode Normalization Forms," by Mark Davis and Martin Dürst, an integral part of The Unicode Standard. (http://www.unicode.org/reports/tr15/)
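
The sample UAX reference above follows a regular pattern, which could be sketched in Python as below. The function name format_uax_reference and the rule for joining author names are illustrative assumptions, as is the premise that each annex is published under reports/tr followed by its number, as in the UAX #15 sample.

```python
def format_uax_reference(number, title, authors):
    """Build a citation string in the style of the UAX #15 sample."""
    # Join authors as "A", "A and B", or "A, B, and C" (assumed rule).
    if len(authors) == 1:
        by = authors[0]
    elif len(authors) == 2:
        by = " and ".join(authors)
    else:
        by = ", ".join(authors[:-1]) + ", and " + authors[-1]
    # Assumption: the annex URL is reports/tr<number>/, as in the sample.
    url = "http://www.unicode.org/reports/tr%d/" % number
    return ('Unicode Standard Annex #%d, "%s," by %s, an integral part of '
            'The Unicode Standard. (%s)' % (number, title, by, url))


if __name__ == "__main__":
    print(format_uax_reference(
        15, "Unicode Normalization Forms", ["Mark Davis", "Martin Dürst"]))
```

Called with the UAX #15 details, the sketch reproduces the sample reference; authors for other annexes have to be supplied by the caller, since they are not listed on this page.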

The latest version of the Unicode Standard is defined by the following list. For a summary of the contents of this version, see Unicode 5.1.0.

Major Reference
The Unicode Consortium. The Unicode Standard, Version 5.0
Boston, MA, Addison-Wesley Developers Press, 2007. ISBN 0-321-48091-0.
Unicode Standard Annexes
UAX #9: Unicode Bidirectional Algorithm
UAX #11: East Asian Width
UAX #14: Unicode Line Breaking Algorithm
UAX #15: Unicode Normalization Forms
UAX #24: Unicode Script Property
UAX #29: Unicode Text Segmentation
UAX #31: Unicode Identifier and Pattern Syntax
UAX #34: Unicode Named Character Sequences
UAX #38: Unicode Han Database (Unihan)
UAX #41: Common References for Unicode Standard Annexes
UAX #42: Unicode Character Database in XML
UAX #44: Unicode Character Database
Unicode Character Database
http://www.unicode.org/Public/UNIDATA, or
ftp://www.unicode.org/Public/UNIDATA
Documentation
  Index.txt
  NamesList.html
  ReadMe.txt
  StandardizedVariants.html
  UCD.html
  Unihan.html
Core Data
  ArabicShaping.txt
  BidiMirroring.txt
  Blocks.txt
  CompositionExclusions.txt
  EastAsianWidth.txt
  HangulSyllableType.txt
  Jamo.txt
  LineBreak.txt
  NameAliases.txt
  NamedSequences.txt
  NamedSequencesProv.txt
  NamesList.txt
  NormalizationCorrections.txt
  PropertyAliases.txt
  PropertyValueAliases.txt
  PropList.txt
  Scripts.txt
  SpecialCasing.txt
  StandardizedVariants.txt
  UnicodeData.txt
  Unihan.txt (very large file, see Unihan.zip)
Derived Data
  CaseFolding.txt
  DerivedAge.txt
  DerivedCoreProperties.txt
  DerivedNormalizationProps.txt
Extracted Data
  DerivedBidiClass.txt
  DerivedBinaryProperties.txt
  DerivedCombiningClass.txt
  DerivedDecompositionType.txt
  DerivedEastAsianWidth.txt
  DerivedGeneralCategory.txt
  DerivedJoiningGroup.txt
  DerivedJoiningType.txt
  DerivedLineBreak.txt
  DerivedNumericType.txt
  DerivedNumericValues.txt
Conformance Test Data
  NormalizationTest.txt
Auxiliary Data for UAX #14 and UAX #29
  GraphemeBreakProperty.txt
  SentenceBreakTest.txt
  GraphemeBreakTest.txt
  LineBreakTest.txt
  SentenceBreakProperty.txt
  WordBreakProperty.txt
  WordBreakTest.txt
Documentation for Auxiliary Data
  GraphemeBreakTest.html
  LineBreakTest.html
  SentenceBreakTest.html
  WordBreakTest.html