Re: accessing extended ranges

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Tue Apr 02 2002 - 21:07:29 EST


Addison Phillips [wM] wrote:

> ICU4J, the IBM opensource project, provides some UTF-16 support capabilities that suggest a possible solution, but there are seemingly intractable problems with the Character class and char data type (luckily most APIs in Java take int arguments for characters instead of char). And it is pretty easy to build classes for processing these characters as surrogate pairs using the Unicode character database.

(Late for this thread.)

ICU4J comes with its own "UCharacter" class that provides Unicode 3.1.1 properties for all code points, using int for the single-character type.
A class library of course cannot fix the problem of string literals with \u: we either write two \u's for a surrogate pair, or use an unescape function (I think on the UTF16 class) that understands \U.
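
To make the two options concrete, here is a minimal sketch in plain Java. It does not use ICU4J: the class name SupplementaryEscapes and the helpers toCodePoint and unescapeBigU are hypothetical names made up for this illustration, not ICU4J API, and the \U form is assumed to be a backslash, a capital U, and exactly eight hex digits. It also assumes a runtime that has StringBuilder.appendCodePoint. The example character is U+10330 (GOTHIC LETTER AHSA).

```java
class SupplementaryEscapes {
    // Combine a lead (high) and a trail (low) surrogate into one code point,
    // using the standard UTF-16 arithmetic.
    static int toCodePoint(char lead, char trail) {
        return ((lead - 0xD800) << 10) + (trail - 0xDC00) + 0x10000;
    }

    // Hypothetical helper: expand each backslash + capital U + 8 hex digits
    // into the UTF-16 form of that code point. Everything else is copied as is.
    // Malformed hex digits or an out-of-range value make this throw.
    static String unescapeBigU(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            if (s.startsWith("\\U", i) && i + 10 <= s.length()) {
                int cp = Integer.parseInt(s.substring(i + 2, i + 10), 16);
                out.appendCodePoint(cp); // emits a surrogate pair above U+FFFF
                i += 10;
            } else {
                out.append(s.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Option 1: two 16-bit escapes in the literal, one per surrogate.
        String viaPair = "\uD800\uDF30";
        // Option 2: a single 32-bit escape, expanded at run time.
        String viaBigU = unescapeBigU("\\U00010330");

        System.out.println(viaPair.equals(viaBigU));
        System.out.println(Integer.toHexString(
                toCodePoint(viaPair.charAt(0), viaPair.charAt(1))));
    }
}
```

Either way the string holds two char values, so any per-character property lookup still has to take the combined int code point rather than a char.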

http://oss.software.ibm.com/icu4j/

markus


