Juliusz Chroboczek wrote:
> I believe that Java strings use UTF-8 internally.
.class files use a _modified_ UTF-8. At runtime, strings are always 16-bit Unicode.
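To make "modified" concrete, here is a small sketch. It assumes that DataOutputStream.writeUTF() produces the same modified UTF-8 that .class files use for string constants, which is my understanding of the two specs. The class and method names are made up for illustration, and StandardCharsets needs a newer JDK than the ones current as of this thread. The two visible differences are that U+0000 is written as two bytes, and that a character outside the BMP is written as two 3-byte surrogates rather than one 4-byte sequence.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Demo {
    // Bytes produced by writeUTF, with its 2-byte length prefix stripped off.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Bytes in standard UTF-8, for comparison.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String nul = "\0";              // U+0000
        String clef = "\uD834\uDD1E";   // U+1D11E, one surrogate pair

        System.out.println(modifiedUtf8(nul).length);   // 2 (0xC0 0x80)
        System.out.println(standardUtf8(nul).length);   // 1
        System.out.println(modifiedUtf8(clef).length);  // 6 (two 3-byte surrogates)
        System.out.println(standardUtf8(clef).length);  // 4
    }
}
```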
> At any rate the
> internal implementation is not exposed to applications -- note that
> `length' is a method in class String (while it is a field in vector
But length() and charAt() are among the APIs that expose the 16-bit Unicode representation, at least semantically. length() counts 16-bit code units (UCS-2/UTF-16), not UTF-8 bytes or UTF-32 code points. charAt(), substring(), etc. all index by those same 16-bit units.
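A quick way to see this is to count the same string three ways. This is only a sketch: the class and helper names are invented for the example, and codePointCount() and StandardCharsets are assumptions about a newer JDK (Java 5 and Java 7 respectively) than what was current for this thread.

```java
import java.nio.charset.StandardCharsets;

public class Utf16LengthDemo {
    // Number of 16-bit code units, which is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points (what a UTF-32 view would count).
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes in standard UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // 'a', U+00E9 (e with acute), U+1D11E (needs a surrogate pair)
        String s = "a\u00E9\uD834\uDD1E";

        System.out.println(codeUnits(s));   // 4
        System.out.println(codePoints(s));  // 3
        System.out.println(utf8Bytes(s));   // 7

        // charAt() indexes code units, so index 2 is the high surrogate alone.
        System.out.println(Integer.toHexString(s.charAt(2))); // d834
    }
}
```

None of the three counts agree, and only the first one is what length() returns.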
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:15 EDT