Java Reference Concepts

Before you turn your attention to solving memory leak problems, it is important that you understand how Java references work. There are four different types of references in Java: strong, weak, soft, and phantom.

Strong References

Strong references are the mainstay of the Java world. As I look at the stacks and stacks of Java tutorials on my desk, I can't find one code sample that uses anything other than strong references. A strong reference pins an object in memory so that the object can't be garbage collected. A strong reference is created whenever an object is assigned to a normal variable. For example, consider the following code:

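String name = "Robert Simmons";
Set<String> nameSet = new HashSet<>();  // needs java.util.Set and java.util.HashSet
nameSet.add(name);  // the set now holds its own strong reference to the String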


This code creates three different strong references. The reference name points to the memory location holding the contents of the immutable String object. The reference nameSet holds the memory location of the newly created Set. Finally, the HashSet itself holds another strong reference to the same object that name points to. This third reference was created when name was passed to the add( ) method.

In every virtual machine, there is a set of references called the root set, which is created and maintained by the virtual machine. Every live object in the virtual machine is referenced, directly or indirectly, from the root set. When you write a main method in your program, all of the objects declared in this method are attached to the root set. Consequently, the objects that these objects refer to are attached to the root set indirectly.

Whenever a path of strong references cannot be drawn from the root set to an object, the object is marked for garbage collection and is removed on the next pass. This process recurses throughout the network until all garbage-collectible objects are gone. To make this a little easier to understand, consider the typical Java program shown in Screenshot-1.

Screenshot-1. Strong-reference structure in a typical program

In this diagram, all of the objects can be traced with strong references to the root set. For example, the root set has a reference to object A, which in turn has a reference to object G, and so on. In this state, the garbage collector would have nothing to do. If you set the reference from A to B to null, you get the result shown in Screenshot-2.

Screenshot-2. Strong references after reference deletion

When A drops its reference to B, the virtual machine checks to see whether it can trace a path from the root set to the object through strong references. It discovers that there is still a path. The path C → F → X → D → J → E → B satisfies the requirement. This type of construct is quite common in Java apps. You can think of E, J, and X as data objects with appropriate listener callbacks to the objects that are listening to them. In this structure, nothing is garbage collected.

To bring this discussion from the theoretical back to the practical, let's revisit the GUI problem discussed earlier. See Screenshot-3.
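The reachability check described above can be approximated with a small graph walk. The following is only a toy model, not how a real collector is implemented: objects are represented by their names, the class StrongReachabilityDemo and its methods are hypothetical, and it assumes that A and C are held directly by the root set. Only the edges named in the text are modeled.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: a real collector walks heap objects, not strings.
class StrongReachabilityDemo {
    // Object name -> names of the objects it strongly references.
    private final Map<String, Set<String>> strongRefs = new HashMap<>();

    void addRef(String from, String to) {
        strongRefs.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    void dropRef(String from, String to) {
        Set<String> targets = strongRefs.get(from);
        if (targets != null) {
            targets.remove(to);
        }
    }

    // Returns every object reachable from the roots through strong references.
    Set<String> reachableFrom(Set<String> roots) {
        Set<String> seen = new HashSet<>(roots);
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            Set<String> targets =
                strongRefs.getOrDefault(current, Collections.<String>emptySet());
            for (String next : targets) {
                if (seen.add(next)) {
                    pending.push(next);
                }
            }
        }
        return seen;
    }

    // Builds only the edges mentioned in the text; the rest of the figure is omitted.
    static StrongReachabilityDemo exampleGraph() {
        StrongReachabilityDemo graph = new StrongReachabilityDemo();
        graph.addRef("A", "B");
        String[] path = {"C", "F", "X", "D", "J", "E", "B"};
        for (int i = 0; i < path.length - 1; i++) {
            graph.addRef(path[i], path[i + 1]);
        }
        return graph;
    }

    public static void main(String[] args) {
        StrongReachabilityDemo graph = exampleGraph();
        graph.dropRef("A", "B");
        // Assumption: A and C are referenced directly by the root set.
        Set<String> roots = new HashSet<>(Arrays.asList("A", "C"));
        System.out.println(graph.reachableFrom(roots).contains("B")); // prints true
    }
}
```

Even after A drops B, the walk still reaches B through the longer path, which is why nothing becomes collectible in this structure.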

Screenshot-3. Strong references in a GUI app

Now it should be a bit clearer why the memory leak occurs. The data object has a list of registered property change listeners that are strong references. The data object is not responsible for adding or removing listeners from this list. The listeners themselves must do this through the addPropertyChangeListener and removePropertyChangeListener methods. The second panel is still pinned in memory by the listener reference from the data object. The data object is pinned by the cache and GUI panel.

As a result, any memory associated with the second panel is not available. All of the buttons, menus, and other data objects it was holding are pinned in memory with it. The end result is a cascade that will increase your program's memory consumption throughout the execution until you restart it, all because of one line of missing code, which will be a pain to find in a large app.

You can solve this problem using only strong references, as long as you never forget anything. If you make even one tiny mistake in your code, or if you forget to remove just one listener, you could strand huge numbers of objects in memory. You would have to choose between a very long session with your profiler or explaining to your network administrators why every computer in the company needs to have a gigabyte of memory just for your app.
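The listener bookkeeping described above might look roughly like the following sketch. The classes CustomerData and CustomerPanel are hypothetical stand-ins for the data object and the panel; only PropertyChangeSupport, PropertyChangeListener, and their methods come from the JDK. The dispose( ) call is the "one line" that is easy to forget.

```java
import java.beans.PropertyChangeEvent;
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

// Hypothetical data object that keeps strong references to its listeners.
class CustomerData {
    private final PropertyChangeSupport support = new PropertyChangeSupport(this);

    void addPropertyChangeListener(PropertyChangeListener listener) {
        support.addPropertyChangeListener(listener);
    }

    void removePropertyChangeListener(PropertyChangeListener listener) {
        support.removePropertyChangeListener(listener);
    }

    // Number of listeners currently pinned by this data object.
    int listenerCount() {
        return support.getPropertyChangeListeners().length;
    }
}

// Hypothetical panel that listens to the data object.
class CustomerPanel implements PropertyChangeListener {
    private final CustomerData data;

    private CustomerPanel(CustomerData data) {
        this.data = data;
    }

    // Registers the new panel as a listener; from now on the data object pins it.
    static CustomerPanel attachTo(CustomerData data) {
        CustomerPanel panel = new CustomerPanel(data);
        data.addPropertyChangeListener(panel);
        return panel;
    }

    @Override
    public void propertyChange(PropertyChangeEvent event) {
        // A real panel would refresh its widgets here.
    }

    // Without this call, the panel stays reachable for as long as the data object lives.
    void dispose() {
        data.removePropertyChangeListener(this);
    }
}
```

If dispose( ) is never called, listenerCount( ) stays at 1 and the panel, along with everything it holds, remains strongly reachable through the data object.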

Weak References

A weak reference does not pin an object in memory. When an object is no longer referred to with any strong references and has only weak references remaining, it is eligible for garbage collection. In Screenshot-4, weak references end with a double empty arrow, and strong references end with a single solid arrow.

Screenshot-4. Weak references in an app

Note that there are many weak references breaking up those dangerous, circular, strong-reference patterns. This time, when A drops its reference to B, the result is the structure shown in Screenshot-5.

Screenshot-5. Weak references after deletion

Deleting the reference from A to B causes B, E, and J to be garbage collected. When the strong reference from A to B is dropped, the virtual machine tries to find a strong-reference path from the root set to B, but it can't. The old path of C → F → X → D → J → E → B is blocked between D and J because the reference from D to J is a weak reference. The circularity is broken, and the objects without paths are garbage collected. Similarly, if you dropped the X-to-D reference, D and X would both be garbage collected.

By strategically distributing weak references in your program, you can eliminate circularity and thus remove memory leaks. Later in this chapter, we will discuss when to use weak references and when to use strong references.

With weak references, you can make sure that an object becomes eligible for garbage collection as soon as you remove the last strong reference to it. This is useful. However, if you want references to be cleared only when free heap memory is low, you need to use soft references.
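As a minimal sketch of the weak-reference behavior described above, the following uses java.lang.ref.WeakReference directly. The class WeakReferenceDemo and its helper clearedAfterGc are hypothetical. Note that System.gc( ) is only a request, so the helper may legitimately report that the reference has not been cleared yet.

```java
import java.lang.ref.WeakReference;

class WeakReferenceDemo {
    // Requests collection a few times and reports whether the reference was cleared.
    // System.gc() is only a hint, so a false result is possible.
    static boolean clearedAfterGc(WeakReference<?> ref) {
        for (int i = 0; i < 10 && ref.get() != null; i++) {
            System.gc();
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        Object referent = new Object();
        WeakReference<Object> weak = new WeakReference<>(referent);

        // While a strong reference exists, the weak reference still yields the object.
        System.out.println(weak.get() == referent); // prints true

        referent = null; // drop the only strong reference
        System.out.println("cleared: " + clearedAfterGc(weak));
    }
}
```

Because get( ) can start returning null at any time once the strong references are gone, callers should copy its result into a local variable and check that variable for null before using it.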

Soft References

Soft references are similar to weak references except that they are, theoretically, cleared only when memory on the heap is low. Therefore, prior to allocating another heap block, the virtual machine attempts to clear memory by dropping all the weak references. If this still isn't enough for the requested new allocation, the virtual machine will clear soft references. So using a soft reference is a way of telling the virtual machine, "If you really need that memory, you can take that object, but don't do it unless you really need it."

Unfortunately, how soft references behave in practice varies between virtual machines. The only firm guarantee is that all soft references to softly reachable objects are cleared before the virtual machine throws an OutOfMemoryError; when they are cleared before that point is left to the implementation. Some virtual machines simply treat soft references like weak references; regardless of whether memory is needed, they wipe out these objects on a garbage-collection pass. You should consult the documentation for your virtual machine to see how it implements soft references.

Soft references are very useful for tools such as cache managers (at least when they are working correctly). As a non-Java example, consider your Internet browser. To speed up the loading of files, it caches the various files and images on your computer. It is convenient if these objects are available in the cache, but this is not necessary. If an object isn't in the cache, the browser fetches the document from the Internet. This is a good example of how a soft reference would be used. Later in this chapter, we will examine an example of how to take advantage of soft references in your programs.
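The browser-cache idea above can be sketched with java.lang.ref.SoftReference. This is an illustrative outline under stated assumptions, not a production cache: the class SoftCache and its loader parameter are hypothetical, it is not thread-safe, and it never removes map entries whose references have been cleared.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache whose values may be reclaimed when memory runs low.
class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader; // fetches a value that is not cached

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        // get() returns null if the collector has already cleared the reference.
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key); // cache miss: fetch again
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

A caller would construct it with something like new SoftCache<String, byte[]>(url -> download(url)), where download is whatever fetch routine the application already has. A cleared entry simply looks like a cache miss, so the caller never needs to know whether the collector intervened.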

Phantom References

Phantom references are completely different from weak or soft references. They are references to objects that the garbage collector has already finished with. The fact that there can be references to data that has already been garbage collected may seem a bit counterintuitive. However, you don't access the object itself, only a reference to the object. You cannot use a phantom reference to call methods or access fields of the object, because its get( ) method always returns null; the reference simply notifies you, through a reference queue, that the object has been garbage collected.

Phantom references are useful if you need to do some cleanup in your app. Suppose you have a class that is watching the contents of a container but doesn't want to interfere with the objects in the container; its job is to count the number of objects deleted from the container. You want a reference to the object when it is completely gone so you can modify your count. Phantom references do the job. Since they refer only to objects that are completely dead, you can be sure your count is accurate.
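The counting scenario above could be sketched as follows. The class DeletionCounter and its methods are hypothetical; PhantomReference and ReferenceQueue are the real java.lang.ref classes. The counter keeps its own strong references to the phantom reference objects, since a reference object that is itself collected is never queued.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.HashSet;
import java.util.Set;

// Hypothetical watcher that counts tracked objects once the collector is done with them.
class DeletionCounter {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // Keeps the reference objects (not the referents) strongly reachable.
    private final Set<Reference<?>> pending = new HashSet<>();
    private int collected;

    // Starts watching an object without pinning it in memory.
    PhantomReference<Object> track(Object target) {
        PhantomReference<Object> ref = new PhantomReference<>(target, queue);
        pending.add(ref);
        return ref;
    }

    // Drains the queue and returns the number of tracked objects collected so far.
    int collectedCount() {
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            pending.remove(ref);
            collected++;
        }
        return collected;
    }
}
```

While a tracked object is still strongly reachable, collectedCount( ) stays unchanged, and calling get( ) on the returned phantom reference yields null regardless of the object's state.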

References and Referents

To implement weak references, the virtual machine uses a two-layered approach, as shown in Screenshot-6.

Screenshot-6. UML diagram of a reference structure

The object that is the target of the weak, soft, or phantom reference is known as the referent. The reference object itself is known as the reference. Furthermore, the referent does not know anything about the reference object. Any object in the virtual machine can be a referent, but only the classes in the java.lang.ref package can be references. This is because the reference classes are tightly integrated with the garbage collector. You can extend the references to add more information, but you cannot create new kinds of references. We will discuss the actual implementation of the reference classes later in this chapter.
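To illustrate extending a reference class with more information, here is a minimal sketch. The class LabeledWeakReference and its label field are hypothetical; the only requirement taken from the text is that it extends one of the java.lang.ref classes rather than defining a new kind of reference.

```java
import java.lang.ref.WeakReference;

// Hypothetical subclass that carries a label alongside the weak referent.
class LabeledWeakReference<T> extends WeakReference<T> {
    // Extra information that stays available even after the referent is collected.
    private final String label;

    LabeledWeakReference(T referent, String label) {
        super(referent);
        this.label = label;
    }

    String label() {
        return label;
    }
}
```

The extra field must not hold the referent itself (directly or indirectly), or the reference would pin its own referent with a strong reference.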

Reference Queues

A reference queue is used to notify the user of a reference about changes in the reachability of its referent. A programmer can use these reference queues to be notified when an object is marked for garbage collection or when it is garbage collected. With reference queues, you communicate with the virtual machine about memory management and garbage collection.

References do not actually appear in the reference queue until the virtual machine decides to garbage collect the object, or the reference is forcibly queued by a method call. For example, if there is a weak reference to an object, and the object is garbage collected, the reference will appear in the queue. You can then act on that reference to remove the information from your collections. For this to work, you have to create a reference queue and pass it to the reference's constructor when the weak reference is created. This registers the reference with the queue. Later, we will look at an example of how to use reference queues.
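A minimal sketch of this registration and cleanup pattern follows. The class WeakRegistry is hypothetical; ReferenceQueue, WeakReference, poll( ), and enqueue( ) are real java.lang.ref API. Calling enqueue( ) is the "forcibly queued by a method call" case mentioned above and makes the behavior observable without waiting for the collector.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry that forgets a name once its object's reference is queued.
class WeakRegistry {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    private final Map<Reference<?>, String> names = new HashMap<>();

    // The queue is handed to the reference's constructor; that is the registration.
    WeakReference<Object> register(String name, Object value) {
        WeakReference<Object> ref = new WeakReference<>(value, queue);
        names.put(ref, name);
        return ref;
    }

    // Removes bookkeeping for every reference found in the queue; returns how many.
    int purge() {
        int removed = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            if (names.remove(ref) != null) {
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return names.size();
    }
}
```

In normal use the collector queues the reference; in a quick experiment, calling enqueue( ) on the reference returned by register( ) and then calling purge( ) shows the entry being removed.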

The Garbage-Collection Process

Garbage collection is one of the core processes of Java. However, it is not as simple as it may appear at first. The requirements of weak references, soft references, and reference queues complicate the process. To make garbage collection a bit clearer, let's examine the UML activity diagram in Screenshot-7.

Screenshot-7. Virtual-machine garbage collection

This diagram shows all of the steps that are taken when a reference to an object is removed from the virtual machine. This happens whenever the user explicitly sets a variable to null, overwrites a reference variable to point to another object, or causes the object holding the reference variable to go out of scope. For each reference that the virtual machine has, it executes this process during its garbage-collection cycle.

First, the virtual machine tries to trace a path from the root set to the object. If it finds a path, no garbage collection takes place. If it doesn't find a path, the virtual machine removes the referenced object from memory. Next, the virtual machine determines whether there are reference objects pointing to the removed object, including weak, soft, and phantom references. If there are reference objects, the virtual machine determines whether they are soft references. If so, it does a memory check. If the memory check indicates that there is no free memory or that the reference is a weak or phantom reference, the referent in the reference is set to null. Finally, if the reference objects are registered with a queue, the reference is added to the queue.

Note that this process happens only during garbage collection. Because of this, it appears that a second thread silently modifies the objects, which in fact does occur. Now it's time to put this thread to work in your app.
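The decision steps above can be restated as a toy model. This is only one reading of the described process, not the virtual machine's actual algorithm: the types ToyCollectorStep, ToyReference, and Kind are hypothetical, the memoryIsLow flag stands in for the memory check, and the model assumes that a reference is queued only when its referent has been cleared.

```java
import java.util.List;

// Toy restatement of the described steps; not real collector code.
class ToyCollectorStep {
    enum Kind { SOFT, WEAK, PHANTOM }

    static final class ToyReference {
        final Kind kind;
        Object referent;
        final List<ToyReference> queue; // null when not registered with a queue

        ToyReference(Kind kind, Object referent, List<ToyReference> queue) {
            this.kind = kind;
            this.referent = referent;
            this.queue = queue;
        }
    }

    // Applies the step to one reference whose referent has no strong path from the root set.
    static void process(ToyReference ref, boolean memoryIsLow) {
        // Soft references survive unless the memory check says memory is low.
        boolean clear = ref.kind != Kind.SOFT || memoryIsLow;
        if (!clear) {
            return;
        }
        ref.referent = null; // the referent in the reference is set to null
        if (ref.queue != null) {
            ref.queue.add(ref); // registered references are added to their queue
        }
    }
}
```

In this model a weak or phantom reference is always cleared and queued, while a soft reference is left alone until memoryIsLow is true.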

      