About the Unicode Character Database
The Unicode Character Database (UCD) consists of a number of data files listing character properties and related data,
along with a documentation file that explains the organization of
the database and the format and meaning of the data in the files.
All files for the most up-to-date version of the Unicode
Character Database can be found at:
http://www.unicode.org/Public/UNIDATA/.
Files in the UNIDATA
directory are unversioned: they do not contain any version
indicator in their file name. However, where a file header is
present, it indicates the UCD version for which that file was last
revised.
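As a rough illustration of reading that version indicator, the sketch below assumes a header whose first line has the form "# Name-X.Y.Z.txt". That layout is an assumption based on how many UCD data files begin, not something this page specifies, and the function name header_version is hypothetical.

```python
import re

# Assumed header shape (not specified on this page):
#   "# <Name>-<major>.<minor>.<update>.txt"
HEADER_RE = re.compile(
    r"^#\s*(?P<name>[A-Za-z0-9_]+)-(?P<version>\d+\.\d+\.\d+)\.txt\s*$"
)

def header_version(first_line):
    """Return (name, version) from a header line, or None if it does not match."""
    match = HEADER_RE.match(first_line)
    if match is None:
        return None
    return match.group("name"), match.group("version")

# Sample inputs: one header-style line, one data line with no header.
print(header_version("# Blocks-15.0.0.txt"))
print(header_version("0000;<control>;Cc;0;BN;;;;;N;NULL;;;;"))
```

Files without a header simply yield None here, which matches the caveat that a header is not always present.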
Whenever the Unicode Character Database is updated, a new update
directory is added under:
http://www.unicode.org/Public/
Each update directory includes only those files that have
changed since previous updates. Files in the update directory are
versioned: they contain a version indicator in their file name.
The complete set of all files for a given version of the UCD consists of the
files in the update directory for that version, together with all
the files unchanged from earlier versions, which are kept in their respective
update directories.
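The rule above can be sketched as a small lookup: walk the update directories in version order up to the target version and, for each file, remember the most recent directory that contains it. This is only a minimal sketch. The update_dirs mapping, the version strings, and the bare file names (with version indicators already stripped) are hypothetical inputs, and files withdrawn in a later version are not handled.

```python
def version_key(version):
    """Turn a dotted version string such as '3.0.1' into a sortable tuple."""
    return tuple(int(part) for part in version.split("."))

def resolve_file_set(update_dirs, target):
    """Map each file name to the update directory holding its copy for `target`.

    update_dirs: dict of version string -> set of file names changed in
                 that update (hypothetical input, names without version suffix).
    target:      version string whose complete file set is wanted.
    """
    resolved = {}
    for version in sorted(update_dirs, key=version_key):
        if version_key(version) > version_key(target):
            break  # later updates do not belong to the target version
        for name in update_dirs[version]:
            resolved[name] = version  # a newer update supersedes an older one
    return resolved

# Hypothetical directory contents, for illustration only.
example_dirs = {
    "3.0.0": {"UnicodeData", "Blocks"},
    "3.0.1": {"UnicodeData"},
    "3.1.0": {"Blocks", "Scripts"},
}
print(resolve_file_set(example_dirs, "3.0.1"))
```

With these sample inputs, the file set for 3.0.1 takes UnicodeData from the 3.0.1 directory and Blocks, unchanged since then, from the 3.0.0 directory.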
For a comprehensive list of all files that make up a given version of the
UCD, see the Unicode Character Database section for that
version in Versions of the Unicode Standard.
During periods when a preliminary (beta) version of
the standard is being released for public comment,
Public Beta files are available. For more information about any ongoing public betas, see
the BETA notice as well as Public Review Issues.
FTP Access
All files and directories in the Unicode Character Database are
accessible via both HTTP and FTP. For FTP access, substitute "ftp:" for
"http:" in any of the links given above.
For example, to access the contents of
http://www.unicode.org/Public/UNIDATA/ by FTP, use the following
modified URL:
ftp://www.unicode.org/Public/UNIDATA
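The scheme substitution can also be done programmatically. The following is a minimal sketch using Python's standard urllib.parse module; the helper name to_ftp is hypothetical, and the sketch only rewrites the URL without checking that the FTP server is reachable.

```python
from urllib.parse import urlsplit

def to_ftp(url):
    """Rewrite an http: URL to the equivalent ftp: URL, leaving host and path as-is."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError(f"expected an http: URL, got {url!r}")
    # SplitResult is a named tuple, so _replace swaps only the scheme field.
    return parts._replace(scheme="ftp").geturl()

print(to_ftp("http://www.unicode.org/Public/UNIDATA/"))
```

Parsing the URL rather than replacing the text "http:" avoids accidentally altering a path that happens to contain the same characters.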