UCN (Java) notation beyond the BMP

From: DougEwell2@cs.com
Date: Tue May 22 2001 - 10:08:01 EDT


Is there a currently accepted format for Universal Character Names (also
known as Java escape sequences) for the Unicode characters beyond U+FFFF?

As an example, I can use "\u16f0" to get a Runic Belgthor, but I can't use
"\u10335" or anything like it to get a Gothic Qairthra*; as far as I know, I
must resort to UTF-16 surrogates and use "\ud800\udf35" instead, which seems
like a kludge.
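
For reference, the surrogate pair above follows from the standard UTF-16 arithmetic: subtract 0x10000 from the code point, add the top ten bits of the result to 0xD800, and add the bottom ten bits to 0xDC00. The following is only a minimal sketch of that calculation; the class name SurrogateEscape and the method toEscape are hypothetical names made up for illustration, not part of any standard library.

```java
class SurrogateEscape {
    // Hypothetical helper (not a standard API): returns the escape text
    // that Java source code needs for the given Unicode code point.
    public static String toEscape(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a code point: " + codePoint);
        }
        if (codePoint < 0x10000) {
            // BMP character: a single four-digit escape is enough.
            return String.format("\\u%04x", codePoint);
        }
        // Supplementary character: split into a UTF-16 surrogate pair.
        int offset = codePoint - 0x10000;      // 20-bit value
        int high = 0xD800 + (offset >> 10);    // top 10 bits
        int low = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits
        return String.format("\\u%04x\\u%04x", high, low);
    }

    public static void main(String[] args) {
        System.out.println(toEscape(0x16F0));   // Runic Belgthor
        System.out.println(toEscape(0x10335));  // Gothic Qairthra
    }
}
```

Run as written, this should print \u16f0 for the Runic character and \ud800\udf35 for the Gothic one, matching the strings quoted above.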

Please don't use this question as an opportunity to propose a new escape
sequence, especially one like "\v" that is incompatible with current
practice. I am interested in a solution that is either official or has
already been proposed by the experts.

-Doug Ewell
 Fullerton, California

*Note: these particular characters were chosen mainly for giggle value.


