UTF-8 and string manipulations in Java

From: ktadenev@ups.com
Date: Wed Jan 07 2009 - 09:42:17 CST


    I have a question about how Java manipulates string data internally, as it pertains to UTF-8 strings.

    Are these statements correct?

    1. java.lang.String expects UTF-8 data, and any data manipulation appears to a Java programmer to be performed in UTF-8.
    2. Internally, when a string manipulation method is invoked (e.g., length(), charAt(int)), Java converts the string content to UTF-16, performs the requested manipulation, and converts the content back to UTF-8. None of this is visible to the Java developer.
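
    A minimal sketch for checking the two statements above, assuming Java 6 or later (for String.getBytes(Charset)). The class name EncodingCheck and its helper methods are hypothetical, chosen only for illustration. It compares what length() reports with the UTF-8 byte count of the same text.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
class EncodingCheck {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    // Number of char values (UTF-16 code units) that the String API reports.
    static int utf16Units(String s) {
        return s.length();
    }

    // Number of bytes produced when the same text is explicitly encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(UTF8).length;
    }

    // Number of Unicode code points (characters in the Unicode sense).
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute): 2 bytes in UTF-8, 1 UTF-16 code unit.
        // U+1D11E (musical G clef): 4 bytes in UTF-8, 2 UTF-16 code units.
        String s = "\u00E9\uD834\uDD1E";

        System.out.println(utf16Units(s));  // 3
        System.out.println(utf8Bytes(s));   // 6
        System.out.println(codePoints(s));  // 2

        // charAt(1) returns half of a surrogate pair, not a UTF-8 byte
        // and not a whole character.
        System.out.println(Character.isHighSurrogate(s.charAt(1))); // true

        // UTF-8 shows up only when bytes are explicitly encoded or decoded.
        String roundTrip = new String(s.getBytes(UTF8), UTF8);
        System.out.println(roundTrip.equals(s)); // true
    }
}
```

    If the String API exposed UTF-8, length() and the UTF-8 byte count would agree. The Java API documentation describes a String as text in UTF-16, so the two are expected to differ for any non-ASCII content (3 versus 6 for the sample above).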

    I would appreciate any insight...

    Thank you,

    Konstantin Tadenev
    Database Architect
    Enterprise Information Architecture
    Location RO3C-123
    340 McArthur Blvd
    Mahwah, NJ 07430
    Phone: (201) 828-4076
