Version 3.0.1

Revision 3.0.1
Authors Mark Davis and Ken Whistler
Date 2000-08-17
This Version http://www.unicode.org/Public/3.0-Update1/UnicodeCharacterDatabase-3.0.1.html
Previous Version http://www.unicode.org/Public/3.0-Update/UnicodeCharacterDatabase-3.0.0.html
Latest Version http://www.unicode.org/Public/UNIDATA/UnicodeCharacterDatabase.html

Copyright © 1995-2000 Unicode, Inc. All rights reserved.


The Unicode Character Database is provided as is by Unicode, Inc. No claims are made as to fitness for any particular purpose. No warranties of any kind are expressed or implied. The recipient agrees to determine applicability of information provided. If this file has been purchased on magnetic or optical media from Unicode, Inc., the sole remedy for any claim will be exchange of defective media within 90 days of receipt.

This disclaimer is applicable for all other data files accompanying the Unicode Character Database, some of which have been compiled by the Unicode Consortium, and some of which have been supplied by other sources.

Limitations on Rights to Redistribute This Data

Recipient is granted the right to make copies in any form for internal distribution and to freely use the information supplied in the creation of products supporting the Unicode™ Standard. The files in the Unicode Character Database can be redistributed to third parties or other organizations (whether for profit or not) as long as this notice and the disclaimer notice are retained. Information can be extracted from these files and used in documentation or programs, as long as there is an accompanying notice indicating the source.


The Unicode Character Database is a set of files that define the Unicode character properties and internal mappings. For more information about character properties and mappings, see The Unicode Standard.

The Unicode Character Database has been updated to reflect Version 3.0 of the Unicode Standard, with many characters added to those published in Version 2.0. Corrections have also been made to case mappings and to other errors noted in the database since the publication of Version 2.0. Normative bidirectional properties have also been modified to reflect decisions of the Unicode Technical Committee.

For more information on versions of the Unicode Standard and how to reference them, see http://www.unicode.org/unicode/standard/versions/.


Character properties may be either normative or informative. Normative means that an implementation that claims conformance to the Unicode Standard (at a particular version) and that makes use of a particular property or field must follow the specifications of the standard for that property or field in order to be conformant. The term normative, when applied to a property or field of the Unicode Character Database, does not mean that the value of that field will never change. Corrections and extensions to the standard in the future may require minor changes to normative values, even though the Unicode Technical Committee strives to minimize such changes. An informative property or field is strongly recommended, but a conformant implementation is free to use or change such values as it may require while still being conformant to the standard. Particular implementations may choose to override the properties and mappings that are not normative. In that case, it is up to the implementer to establish a protocol to convey that information.
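The distinction can be illustrated with a minimal sketch in Python of an implementation that layers its own values over informative properties while refusing to touch normative ones. The property names, the contents of the NORMATIVE set, and the with_overrides helper are all hypothetical and chosen only for illustration; which properties are actually normative is determined by the standard, not by this code.

```python
from collections import ChainMap

# Assumption: this set is illustrative only. The standard, not this code,
# decides which properties are normative.
NORMATIVE = frozenset({"bidi_class"})


def with_overrides(base, overrides):
    """Return a lookup that prefers `overrides` and falls back to `base`.

    Raises ValueError if an override targets a property listed as normative.
    """
    blocked = NORMATIVE.intersection(overrides)
    if blocked:
        raise ValueError(f"cannot override normative properties: {sorted(blocked)}")
    # ChainMap searches `overrides` first, then `base`; `base` is not modified.
    return ChainMap(overrides, base)


# Hypothetical property record for a single character.
base = {"bidi_class": "L", "unicode_1_name": ""}
props = with_overrides(base, {"unicode_1_name": "LOCAL LABEL"})
```

Keeping the overrides in a separate mapping leaves the published values intact, so an implementation can still report which values it changed when it conveys that information to others.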


The following summarizes the files in the Unicode Character Database. For more information about these files, see the referenced technical report(s) or section of The Unicode Standard, Version 3.0.

UnicodeData.txt (Chapter 4, UTR #21: Case Mappings, UAX #15: Unicode Normalization Forms)

PropList.txt (Chapter 4)

SpecialCasing.txt (Chapter 4, UTR #21: Case Mappings)

Blocks.txt (Chapter 14)

Jamo.txt (Chapter 4)

ArabicShaping.txt (Section 8.2)

NamesList.txt (Chapter 14)

Index.txt (Chapter 14)

CompositionExclusions.txt (UAX #15: Unicode Normalization Forms)

LineBreak.txt (UAX #14: Line Breaking Properties)

EastAsianWidth.txt (UAX #11: East Asian Character Width)

BidiMirroring.txt (UAX #9: The Bidirectional Algorithm)

CaseFolding.txt (UTR #21: Case Mappings)

NormalizationTest.txt (UAX #15: Unicode Normalization Forms)
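
As an illustration of how these files are typically consumed, the following Python sketch reads one record from UnicodeData.txt. It assumes the file's layout of 15 semicolon-separated fields per line, with the code point, character name, and general category in the first three fields and the lowercase mapping in the fourteenth; this summary does not describe that layout, so check it against the documentation accompanying the file. The Record type and parse_line function are hypothetical names introduced here.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    code_point: int
    name: str
    general_category: str
    lowercase: Optional[int]  # lowercase mapping, or None if the field is empty


def parse_line(line: str) -> Record:
    """Parse one UnicodeData.txt line (assumed: 15 fields separated by ';')."""
    fields = line.rstrip("\n").split(";")
    if len(fields) != 15:
        raise ValueError(f"expected 15 fields, got {len(fields)}")
    # Code points and mappings are written as hexadecimal without a prefix.
    lowercase = int(fields[13], 16) if fields[13] else None
    return Record(int(fields[0], 16), fields[1], fields[2], lowercase)


record = parse_line("0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;")
```

Only four of the fields are kept here to keep the sketch short; a fuller reader would carry the remaining fields through in the same way.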