2 Replies. Latest reply: Jun 19, 2012 2:50 AM by sabre150

String instances added to pool using String::intern

kaja.mohideen Newbie
I'm involved in developing an application that generates CSV reports from data collected in multiple maps. Because many of the string values we receive from various sources are likely to be identical, we call String::intern on them before adding them to a Map, to reduce memory consumption.

Environment:
Sun Solaris 10
Java 1.6 Update 27 (64 Bit)

We are now facing an issue: some strings have the correct values when we retrieve them from the sources, but the report shows incorrect values for them (in a few columns of the CSV).

Most of the strings carry UTF-8 values (Japanese).

Please advise whether it is OK to use intern, as I don't know when the strings will be removed from the pool or how it behaves for UTF-8 Strings.

Thanks in advance,

// Kaja
  • 1. Re: String instances added to pool using String::intern
    EJP Guru
    String.intern() does not cause charset problems. Strings that have been interned can be garbage-collected just like any others.
  • 2. Re: String instances added to pool using String::intern
    sabre150 Expert
    kaja.mohideen wrote:
    Most of the strings carry UTF-8 values (Japanese).
    Strings in Java are always Unicode values encoded as UTF-16 code units. You don't have UTF-8 strings or ISO-8859-1 strings or ASCII strings; you only have Unicode strings. If a String is derived from UTF-8 bytes, the charset must be named explicitly when converting. For example, String x = new String(bytes, "UTF-8") will convert the UTF-8 bytes to a String containing the Unicode characters encoded as UTF-16 code units. An important point is that once you have a String, you lose all knowledge of how that String was created and what it was created from, since nothing is stored within String to indicate this.
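
The two points made in the replies can be checked with a small sketch. It is not the poster's code: the class name InternCharsetDemo and its helper methods are hypothetical, and ISO-8859-1 is only an assumed example of a wrong charset. It shows that decoding UTF-8 bytes with the right charset round-trips Japanese text, that decoding the same bytes with the wrong charset garbles it, and that intern() leaves the characters untouched.

```java
import java.nio.charset.Charset;

public class InternCharsetDemo {
    // Charset.forName is used rather than StandardCharsets so the sketch
    // also compiles on Java 6, the version mentioned in the question.
    private static final Charset UTF8 = Charset.forName("UTF-8");
    private static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    /** Decodes raw bytes using an explicitly named charset. */
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    /** Returns the canonical pooled instance; the characters are unchanged. */
    public static String pooled(String s) {
        return s.intern();
    }

    public static void main(String[] args) {
        // Three Japanese characters, written as escapes so the result does
        // not depend on the encoding of this source file.
        String original = "\u65e5\u672c\u8a9e";
        byte[] utf8Bytes = original.getBytes(UTF8);

        String correct = decode(utf8Bytes, UTF8);    // right charset
        String garbled = decode(utf8Bytes, LATIN1);  // wrong charset (assumed example)

        System.out.println(correct.equals(original));          // true
        System.out.println(garbled.equals(original));          // false
        System.out.println(pooled(correct).equals(original));  // true

        // Equal strings intern to the same canonical instance.
        System.out.println(pooled(correct) == pooled(new String(original))); // true
    }
}
```

If the report shows wrong characters, a reasonable place to look is wherever bytes are turned into Strings or Strings are written back out as bytes (for example, a reader or writer created without an explicit charset), rather than the intern() call.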
