Unicode Support in Oracle9i Database: New Unicode Features
Jianping Yang & Gary Chen, Oracle Corporation
As global ebusiness grows into every aspect of industry as a fundamental infrastructure for information and business management, it is becoming crucial to develop internal applications with multilingual capabilities. Unicode is widely accepted across many platforms as a standard for providing this capability.
Oracle has supported Unicode in the UTF-8 encoding since Oracle7. In Oracle9i, the newest release, this support is enhanced to better serve global ebusiness needs. Codepoint semantics is introduced for text data so that UTF-16 semantics can be built on top of the UTF-8 encoding, which makes it easier to support application servers built on UTF-16. The benefits of this approach are reduced migration effort and better storage efficiency for Latin data. A Unicode data type is also introduced so that Unicode applications can be built independently of the database character set. This data type allows existing applications to be migrated to Unicode gradually, and it offers a choice of UTF-8 or UTF-16 as its encoding, so that storage can be matched to the distribution of the data.
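The difference between byte semantics, codepoint semantics, and UTF-16 semantics can be illustrated outside the database. The following is a minimal Python sketch, not Oracle syntax: the function names and the `fits_column` helper are hypothetical and only show how the same string yields different lengths depending on which unit is counted.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# part of any Oracle interface. They show how one string has different
# "lengths" under byte, code point, and UTF-16 code unit counting.

def length_in_bytes(text: str, encoding: str = "utf-8") -> int:
    """Byte semantics: count the bytes of the encoded form."""
    return len(text.encode(encoding))


def length_in_code_points(text: str) -> int:
    """Codepoint semantics: count Unicode code points (Python str units)."""
    return len(text)


def length_in_utf16_units(text: str) -> int:
    """UTF-16 semantics: count 16-bit code units (surrogate pairs count twice)."""
    return len(text.encode("utf-16-le")) // 2


def fits_column(text: str, limit: int, semantics: str) -> bool:
    """Check a value against a column limit under the chosen semantics.

    Assumption: `semantics` is one of "byte", "codepoint", or "utf16";
    the stored encoding is taken to be UTF-8 for the byte case.
    """
    counters = {
        "byte": length_in_bytes,
        "codepoint": length_in_code_points,
        "utf16": length_in_utf16_units,
    }
    return counters[semantics](text) <= limit


sample = "Zürich 東京"          # 9 code points, mixed Latin and CJK
supplementary = "\U00010400"    # one code point outside the BMP

print(length_in_code_points(sample), length_in_bytes(sample))
print(length_in_code_points(supplementary), length_in_utf16_units(supplementary))
print(fits_column(sample, 10, "codepoint"), fits_column(sample, 10, "byte"))
```

Under these assumptions, a limit of 10 accepts the sample string when code points are counted but rejects it when UTF-8 bytes are counted, which is the kind of mismatch codepoint semantics is meant to remove for applications that think in characters.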
This paper describes the functionality of codepoint semantics and the new Unicode data type in the Oracle9i database release. Design choices, such as codepoint or byte semantics, Unicode Database or Unicode data type, and UTF-8 versus UTF-16, will be discussed. A brief description of the new Unicode access interface will be given to round out the complete multilingual capabilities offered throughout the Oracle Development Platform.
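The UTF-8 versus UTF-16 trade-off depends on the data distribution, and the arithmetic is easy to check. The sketch below is a rough Python illustration with hypothetical sample strings and a hypothetical `storage_bytes` helper; it measures encoded size only and ignores any database storage overhead.

```python
# Illustrative sketch only: compares raw encoded sizes, not actual
# database storage. The sample strings are arbitrary examples.

def storage_bytes(text: str) -> dict:
    """Return the encoded size of `text` in UTF-8 and UTF-16 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


latin = "global ebusiness"      # ASCII only: 1 byte per character in UTF-8
cjk = "多言語データベース"        # BMP CJK/kana: 3 bytes each in UTF-8, 2 in UTF-16

for label, text in (("latin", latin), ("cjk", cjk)):
    sizes = storage_bytes(text)
    smaller = min(sizes, key=sizes.get)
    print(label, sizes, "smaller:", smaller)
```

For ASCII-heavy text UTF-8 takes half the space of UTF-16, while for text dominated by East Asian characters in the Basic Multilingual Plane UTF-16 takes about two thirds of the UTF-8 size.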
When the world wants to talk, it speaks Unicode
International Unicode Conferences are organized by Global Meeting Services, Inc. (GMS).
GMS is pleased to be able to offer the International Unicode Conferences under an exclusive
license granted by the Unicode Consortium. All responsibility for conference finances and
operations is borne by GMS. The independent conference board serves solely at the pleasure
of GMS and is composed of volunteers active in Unicode and in international software
development. All inquiries regarding International Unicode Conferences should be addressed
Unicode and the Unicode logo are registered trademarks of Unicode, Inc. Used with permission.
22 Jun 2001, Webmaster