Unicode Frequently Asked Questions
 

FAQ Topics and Categories

The Unicode Frequently Asked Questions are organized into different topic pages. The list of topic areas is shown below, along with brief explanations of what kinds of questions are answered in each topic area.  Click on a topic area to go to that page. The questions in each FAQ page are listed at the top of the page, and are linked to bookmarks further down the page. Many FAQ pages contain links to other pages where you will find further information about specific topics.

You may also find the search page useful. If you choose "Subject Search (fuzzy)", you will see the context around the search term. For example, go to the search page and type in "solidus FAQ", "BOM FAQ", or "magic wand FAQ".

The FAQs are collected from many sources. For an explanation, see Attribution.

Basic Questions
Discusses the features of Unicode, how it differs from other encodings, and answers basic support questions.
Blocks and Ranges
Definitions and usage of Unicode blocks and ranges, and questions about blocks versus script values for characters.
Character Properties, Case Mappings and Names
Answers questions about case conversions and case mappings; also about character names.
Characters and Combining Marks
Discusses a variety of details about text elements, combining characters, compatibility mappings, canonical equivalence...
CLDR and Locales
Answers questions about Unicode Locales, CLDR, and LDML.
Chinese, Japanese and Korean
Questions specific to Han ideographs, CJK language handling, Hangul and Jamo characters, and East Asian fonts.
Collation 
Answers questions about sorting and ordering, Unicode and Java.
Compression
The Unicode compression algorithm, LZW, Huffman encoding, and others.
Conversions / Mappings
Conversion and mapping to/from other character sets.
Coping with Change
Adapting to changes in the Unicode Standard.
Display of Unsupported Characters
Discusses what to do when attempting to display unsupported Unicode characters.
Fonts & Keyboards
Where to find more information about fonts. Displaying characters in Java.
Greek
Questions specific to the Greek language, script, and fonts.
Indic Scripts and Languages except Tamil
Questions specific to Indic scripts, languages, fonts, and keyboards.
Language Tagging
Plane 14 language tags and language tagging in general.
Ligatures, Digraphs and Presentation Forms
Can't find a certain digraph or ligature your language needs?  Can you use a particular presentation form?
Line Breaking
Questions about how to break text into separate lines for display.
Middle Eastern Scripts and Languages
Questions about Arabic, Hebrew, and other Middle Eastern scripts.
Normalization
Questions regarding the various normalization forms, their use, and where to go for further information.
Programming Issues
Questions about converting string handling in older programs, as well as other issues in supporting Unicode strings in programs.
Proposed New Characters
What are the latest proposals?  What about my script?  When will the next Unicode book be available?
Security Issues
Does Unicode pose security problems? What can be done about such problems as character spoofing?
Specifications
Information on where to find specifications or guidelines for dealing with different programming tasks in the Unicode Standard and related standards.
Submitting Successful Character and Script Proposals
Guidelines on how to write a successful proposal to add new characters or a new script, or to fix a problem in the standard.
Tamil Script and Language
Issues related to the Tamil language and script.
Technical Reports Development Process
Discusses the development and maintenance process for technical reports, including how they are created and archived.
Unicode and ISO 10646
Relationships between Unicode and ISO working groups, ISO standards.  How Unicode differs from 10646.
Unicode and the Web
Unicode in other standards (W3C, IETF, etc.). How to deal with numeric character references, Unicode in HTML, etc.
UTF-8, UTF-16, UTF-32 & BOM
Questions about encoding forms (UTF-8, UTF-16, and UTF-32), definitions of a UTF (Unicode Transformation Format), and use of the byte order mark.
Writing Direction and BIDI Ordering
Questions about writing direction, particularly bidirectional ("bidi") text that mixes left-to-right and right-to-left runs.