Immutable Problems

Although immutable objects are extremely useful in creating solid code, they can cause problems if you aren't paying attention. The most important thing to remember about immutable objects is that whenever you try to change them, you actually end up creating new objects that are themselves immutable. This can result in some extremely slow code. The String trap is one of the most prevalent examples of this problem.

The String Trap

The most commonly used immutable type in the Java language is java.lang.String. However, many good developers don't know that String is immutable. They are fooled by all of the "operations" that can be done to a String, and they often say, "How can String be immutable? I can concatenate strings and replace values." These well-meaning developers are wrong! Whenever you perform an operation on a string, you are actually creating a copy of the string that is modified to accommodate your request. This applies to concatenation as well as to operations such as splitting and replacing strings. Because this nuance of Java is not well known, it is the cause of many common programming problems. For example, consider the following code used to concatenate strings in a sentence:

public void buildSentence (String[] words) {
    String sentence = new String( );
    for (int idx = 0; idx < words.length; idx++) {
        sentence += " " + words[idx];
    }
}


The problem with this code is that at each and every iteration of the for loop, the virtual machine allocates an entirely new String object; this new object contains the characters in the sentence variable concatenated with the word being added. Since the String object is immutable, a new String object must be created to reflect each modification. Assuming that 234,565 words are being added, you may want to take a long lunch. If that isn't bad enough, remember that all of those intermediary String objects that this code created are not purged from memory until the garbage collector runs, at which point the program will hit another speed bump the size of Mt. Everest. So, on top of a slow program, you have a program that eats memory like candy. This problem is not unique to String objects. Every immutable type suffers from the fact that the virtual machine must allocate an entirely new object to represent a changed value. The solution to the allocation problem in the previous example is to use java.lang.StringBuffer:

public void buildSentence (String[] words) {
    final StringBuffer sentence = new StringBuffer(1000);
    for (int idx = 0; idx < words.length; idx++) {
        sentence.append(" ");
        sentence.append(words[idx]);
    }
}


This version of the method from Example 1-7 is much improved. The StringBuffer is a mutable object, so it can be changed in place without allocating a new object for each modification. Its internal character array is reallocated only when the contents exceed the buffer's capacity.
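The contrast between the two types can be checked directly. The following is a minimal sketch that assumes only the standard library; the class name MutabilityDemo and its helper methods are hypothetical and are not part of the original example code. It shows that concat( ) and replace( ) leave the original String untouched, while append( ) changes the StringBuffer in place and returns the same object.

```java
// Hypothetical demo class; not part of the original example code.
class MutabilityDemo {
    // "Modifies" the string and then returns the original reference.
    // The results of concat() and replace() are deliberately discarded.
    static String afterStringOps(String original) {
        original.concat(" world");   // builds a new String; original is unchanged
        original.replace('h', 'j');  // builds a new String; original is unchanged
        return original;
    }

    // Returns true if append() hands back the very same buffer object.
    static boolean appendReturnsSameObject(StringBuffer buf) {
        StringBuffer returned = buf.append(" world");
        return returned == buf;
    }

    public static void main(String[] args) {
        System.out.println(afterStringOps("hello"));       // prints "hello"
        StringBuffer buf = new StringBuffer("hello");
        System.out.println(appendReturnsSameObject(buf));  // prints "true"
        System.out.println(buf);                           // prints "hello world"
    }
}
```

To keep the result of a String operation, you must assign the returned value to a variable; the StringBuffer needs no such reassignment.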


You can even tune the buffer instance to your needs to optimize these allocations. For concatenations that aren't very long, such as dynamic SQL, you can get away with 500 or 1,000 as the initial capacity. For those that are extremely long, you can pass a bigger initial capacity to the constructor:

final StringBuffer moreSpace = new StringBuffer(2500);

Your previously slow program now runs like a Ferrari, and takes up only a fraction of the memory it did before.
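One way to see the effect of the initial capacity is to watch capacity( ) while appending. This is a rough sketch under stated assumptions: the class name CapacityDemo, the method countGrowths( ), and the sizes used are all hypothetical choices for illustration. It counts how often the reported capacity changes, which indicates that the internal array was replaced by a larger one.

```java
// Hypothetical demo class; the sizes used here are arbitrary.
class CapacityDemo {
    // Appends `count` single characters to a buffer created with the given
    // initial capacity and returns how many times capacity() changed.
    static int countGrowths(int initialCapacity, int count) {
        StringBuffer buf = new StringBuffer(initialCapacity);
        int growths = 0;
        int lastCapacity = buf.capacity();
        for (int i = 0; i < count; i++) {
            buf.append('x');
            if (buf.capacity() != lastCapacity) {
                growths++;
                lastCapacity = buf.capacity();
            }
        }
        return growths;
    }

    public static void main(String[] args) {
        // A small buffer has to grow several times to hold 2,500 characters.
        System.out.println("initial 16:   " + countGrowths(16, 2500) + " growths");
        // A buffer sized for the final length never has to grow.
        System.out.println("initial 2500: " + countGrowths(2500, 2500) + " growths");
    }
}
```

The exact number of growth steps for the small buffer depends on the library implementation, so the sketch only prints it and makes no claim about its value.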

Buffering Bad Performance

Now that you have the concept of immutable types firmly in mind, you know that you should use StringBuffer objects when trying to concatenate Strings. On top of that, you have mastered the StringBuffer class and are now ready to use it in your toString( ) methods. However, before you get too excited, let's examine a couple of common StringBuffer pitfalls:

package oracle.hcj.immutable;
public final static String dumpArray(int[] array) {
    final StringBuffer buf = new StringBuffer(500);
    buf.append("{");
    for (int idx = 0; idx < array.length; idx++) {
        buf.append("[" + idx + "] " + array[idx]);
    }
    buf.append("}");
    return buf.toString( );
}


The problem with this code is that it is a performance hog. Referring to the Javadoc documentation on the StringBuffer class, you will find that whenever you concatenate Strings using the + operator, the compiler generates code that creates a new StringBuffer object to do the concatenation. Since you already have a StringBuffer, this is wasteful; why not reuse the buffer you already created? In the previous method, you created not just one StringBuffer, but an additional one for each iteration in the loop. Each time this method runs, it allocates (array.length + 1) StringBuffer objects. Since allocation within a loop should be avoided, rewrite this method to decrease processor time and memory usage:

package oracle.hcj.immutable;
public final static String dumpArrayBetter(int[] array) {
    final StringBuffer buf = new StringBuffer(500);
    buf.append("{");
    for (int idx = 0; idx < array.length; idx++) {
        buf.append("[");
        buf.append(idx);
        buf.append("] ");
        buf.append(array[idx]);
    }
    buf.append("}");
    return buf.toString( );
}


In this revised method, you create only one StringBuffer instance and append to that buffer in each iteration. The method contains more lines, but it is now lightning quick. If you're not convinced of the impact of these changes, run the oracle.hcj.review.BufferingBadPerformance class in this tutorial's example code. But beware: the micro program may appear to hang because of this allocation snippet:

public static final void dumpArrayReallyBad(int[] array) {
    String result = new String("{");
    for (int idx = 0; idx < array.length; idx++) {
        result += "[" + idx + "] " + array[idx];
    }
    result += "}";
}


This method uses String concatenation, allocating a new String on every pass through the loop, and is horribly slow. It can literally take minutes to run with values that the other methods crunch in a fraction of a second. If you want to try the test with high values, I suggest you either comment the execution of this method out or get a small book to read, perhaps War and Peace. A sample run of the program is shown here:

>ant oracle.hcj.review.BufferingBadPerformance run_example
run_example:
 [java] Building 10000 element Fibbonacci Number array took 0 millis
 [java] Using dumpArray took 150 millis
 [java] Using dumpArrayBetter took 40 millis
 [java] Using dumpArrayReallyBad took 88347 millis


The dumpArrayBetter( ) method was significantly faster than the dumpArray( ) method. However, the output shows that dumpArrayReallyBad( ), which uses string allocations, took more than 88 seconds to run! If you choose values bigger than 10,000, your virtual machine will probably run out of memory before dumpArrayReallyBad( ) is done.


When you run the examples, your times may vary depending on the speed of your computer and other programs you are running, but the ratios should consistently represent the claims of improved performance achieved by using a single buffer.
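If you do not have the example code at hand, a rough comparison can be put together in a few lines. The following is a minimal sketch, not the BufferingBadPerformance class itself: the class name ConcatTiming, the method names viaString( ) and viaBuffer( ), and the array size are assumptions made for illustration. It builds the same output both ways, times each approach with System.nanoTime( ), and confirms that the results match.

```java
// Hypothetical timing harness; not the BufferingBadPerformance class
// from the example code.
class ConcatTiming {
    // Builds the dump with String +=, allocating a new String on every pass.
    static String viaString(int[] array) {
        String result = "{";
        for (int idx = 0; idx < array.length; idx++) {
            result += "[" + idx + "] " + array[idx];
        }
        return result + "}";
    }

    // Builds the same dump by appending to a single StringBuffer.
    static String viaBuffer(int[] array) {
        StringBuffer buf = new StringBuffer(500);
        buf.append("{");
        for (int idx = 0; idx < array.length; idx++) {
            buf.append("[").append(idx).append("] ").append(array[idx]);
        }
        buf.append("}");
        return buf.toString();
    }

    public static void main(String[] args) {
        int[] data = new int[5000];  // arbitrary size; every element is zero
        long t0 = System.nanoTime();
        String a = viaString(data);
        long t1 = System.nanoTime();
        String b = viaBuffer(data);
        long t2 = System.nanoTime();
        System.out.println("viaString took " + (t1 - t0) / 1_000_000 + " millis");
        System.out.println("viaBuffer took " + (t2 - t1) / 1_000_000 + " millis");
        System.out.println("same output: " + a.equals(b));
    }
}
```

A single timed run like this is only a crude measurement; the first call also pays for class loading and warm-up, so treat the numbers as an indication of the ratio rather than as a benchmark.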


Although small improvements in speed, such as the difference between dumpArray( ) and dumpArrayBetter( ), may not seem very important, they will add up quickly if the affected method is run thousands of times. Consider a method run 20,000 times in the course of a batch processing call. If you shave just 15 milliseconds off each call to that method, you save 15 x 20,000, or 300,000 milliseconds, which is five minutes. A call that took five minutes before now runs in the blink of an eye.

      