Unicode Support in Oracle9i Database - New Unicode Feature
Jianping Yang & Gary Chen - Oracle Corporation
As global e-business grows into every aspect of industry as an infrastructure for information and business management, it is becoming crucial to build Internet applications with multilingual capability. Unicode is widely used as the basis for providing this capability.
Oracle has supported Unicode through the UTF-8 encoding since Oracle7. In the newest release, Oracle9i, this support is further enhanced to better serve global e-business needs. Codepoint semantics is introduced for text data so that UTF-16 semantics can be built on top of the UTF-8 encoding, which makes it easy to support application servers built on UTF-16. The benefits of this solution are reduced migration effort and better storage efficiency for Latin data. A Unicode data type is also introduced so that Unicode applications can be built independently of the database character set. This data type lets existing applications migrate to Unicode gradually, and it can use either UTF-8 or UTF-16 as its encoding, whichever is more storage-efficient for the data distribution.
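To make the distinction between byte counts, code point counts, and UTF-16 code units concrete, here is a minimal sketch in Python. It does not use any Oracle API; the function name `measure` and the sample strings are illustrative assumptions, and the sketch only shows how the same text measures differently under each view, which is the trade-off behind choosing UTF-8 or UTF-16 for storage.

```python
def measure(text: str) -> dict:
    """Return the size of `text` under several counting schemes.

    `measure` is a hypothetical helper written for this illustration only.
    """
    utf8 = text.encode("utf-8")
    utf16 = text.encode("utf-16-be")  # big-endian, so no byte order mark is added
    return {
        "code_points": len(text),             # Python str length counts code points
        "utf8_bytes": len(utf8),
        "utf16_bytes": len(utf16),
        "utf16_code_units": len(utf16) // 2,  # each UTF-16 code unit is 2 bytes
    }


# Assumed sample data: Latin text, Japanese katakana, one supplementary character.
samples = {
    "latin": "database",
    "katakana": "\u30c7\u30fc\u30bf\u30d9\u30fc\u30b9",
    "supplementary": "\U00010400",
}

for name, text in samples.items():
    print(name, measure(text))
```

In this sketch the Latin sample takes half as many bytes in UTF-8 as in UTF-16, while the katakana sample takes fewer bytes in UTF-16. The supplementary character is a single code point but two UTF-16 code units, which is why a length limit means different things under code point counting and UTF-16 code unit counting.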
This paper describes the functionality of codepoint semantics and the new Unicode data type in the Oracle9i database release. Design choices, such as codepoint semantics versus the Unicode data type and UTF-8 versus UTF-16, are discussed. A brief description of the new Unicode access interface is also given.
11 December 2000, Webmaster