Memory Usage and the Garbage Collector

When working with memory and garbage collection, keep a couple goals in mind:

  • Don't use too much memory for the target system you're writing the game for, whether it's 64MB or 256MB.
  • Try to limit garbage-collection pauses as much as possible.

Pauses due to the garbage collector can make your game seem jerky. However, if you limit the number of objects you create per frame and tweak the Java heap settings, you can get around many garbage-collection pauses. In this section, we go over some techniques to monitor memory usage and tune heap settings to get things to run a bit more smoothly.

The Java Heap and Garbage Collection

Java memory is stored in a heap—a large chunk of memory—that is managed by the VM. This has a few advantages over letting the operating system manage memory because HotSpot can optimize memory management for typical Java apps. By default, HotSpot starts with a small heap, such as one a couple megabytes. As more objects are created, the heap grows as needed. When no more references to an object exist, HotSpot can clear the object from the heap. This is called garbage collection. Note that some memory exists outside the heap, such as NIO buffers, memory-mapped files, or images in video memory. The heap is partitioned into two sections: young and tenured. Objects are first created in the young section. Object creation in this section is fast, often involving just a pointer increment. If an object is around long enough, it is moved to the tenured section. The idea here is that most objects are around for only a short amount of time. For example, an Iterator object created to traverse the items in a list usually is in use for a short time. HotSpot can clear the unreferenced objects in the young section fairly quickly, so creating a few short-lived objects won't have much of an impact on performance. Clearing unreferenced objects in the tenured section is a bit slower, however, because this section is larger and involves many long-lived objects. This is where you might see a noticeable pause from the garbage collection. Ideally, you want to limit tenured garbage collections as much as possible. To do this, though, you need to take a look at how much memory the game is using and when the garbage collections occur.

Monitoring Garbage Collection

Garbage collection can impact game performance, but luckily there are some ways to make it a less significant issue. You can't get rid of garbage collection completely, but you can make its effects almost negligible. First, though, it's a good idea to monitor garbage collection to get a good idea of how your game is handled by the garbage collector. You can get on information whenever garbage collection occurs by using the -verbose:gc flag:

java -verbose:gc MyCoolGameClass
[GC 16373K->16311K(17760K), 0.0028920 secs]
[GC 17316K->17311K(18568K), 0.0028735 secs]
[Full GC 17311K->6412K(18568K), 0.0587969 secs]


Each line shows a particular garbage collection. Garbage collections that involve just the young section are marked as GC, and garbage collections that involve the tenured section are marked as Full GC. The first numbers (17311K->6412K) show the amount of memory used before and after the garbage collection. The number in parenthesis (18568K) is the total heap space (not including some of the heap used by HotSpot itself). Finally, the amount of time taken to perform the collection is shown. As you can see, the young garbage collections took only about 3ms each, which doesn't have a significant effect on performance. However, the tenured garbage collection (Full GC) took almost 60ms, which results in a few dropped frames. For example, if it takes 20ms to draw a frame (50 frames per second), the garbage collection took the time to draw about 3 frames. You can get more details from the garbage collections by using the -XX:+PrintGCDetails flag:

java -verbose:gc -XX:+PrintGCDetails MyCoolGameClass
[GC [DefNew: 1123K->0K(1152K), 0.0027800 secs]
 14013K->14012K(16292K), 0.0028174 secs]
[GC [DefNew: 376K->1K(1152K), 0.0013854 secs]
 14388K->14388K(16292K), 0.0014222 secs]
[GC [DefNew: 1014K->1K(1152K), 0.0026643 secs]
 [Tenured: 15396K->12400K(16152K), 0.0415779 secs]
 15401K->12400K(17304K), 0.0443534 secs]


Here, the garbage collections are broken down into the young section (DefNew) and the tenured section. The size of each section is shown, along with the total size. Here, the young section is 1152K, and the tenured section is 16,152K. Again, the young garbage collections took about 2–3ms, but the tenured collections took a total of about 45ms. Luckily, a few dropped frames are not noticeable, so the garbage collector isn't having much of an impact in this case. You can get still more garbage collection information by using the -XX:+PrintGCTimeStamps flag. This flag prints the time stamps of each garage collection, so you can get a good look at how often the collections occur. By adding a time stamp to your own logging, you can also easily match up when garbage collection is occurring relative to important sections of your own code.

Monitoring Memory Usage

If you're creating 20KB of garbage every time you draw a frame, at 60 frames per second, that comes out to about 1.2MBps, or nearly 12MB every 10 seconds. That's a lot of junk, a lot more work for the garbage collector to do. So, it's important to monitor the rate at which memory is allocated in your game, to get an idea of how you can tune the heap to better suit your needs. You can get information on the memory that each object uses by using the -Xaprof flag:

java -Xaprof MyCoolGameClass Allocation profile (sizes in bytes, cutoff = 0 bytes):
____Size__Instances__Average__Class________________
39954200 590 67719 [S
 2949360 10406 283 [B
 2838936 118289 24 com.brackeen...Vector3D
 1036904 9062 114 [C
 718848 3744 192 sun.java2d.SunGraphics2D
 475008 29688 16 sun.awt.MostRecentKeyValue
 421400 7525 56 java.awt.event.MouseEvent
 298464 18654 16 java.awt.Point
 273960 11415 24 java.lang.ref.WeakReference
 245936 3942 62 [Ljava.lang.Object;
 239680 3745 64 java.awt.geom.AffineTransform
 212736 8864 24 java.awt.EventQueueItem
 203760 8490 24 java.lang.String
 161504 5047 32 java.util.LinkedList$ListItr
 149760 3744 40 sun.java2d.loops.FontInfo
 141632 8852 16 sun.awt.EventQueueItem
 130560 8160 16 com.brackeen...ScanConverter$Scan
 129600 5400 24 java.util.HashMap$Entry
 119840 3745 32 sun.java2d.pipe.Region
 114864 4786 24 java.util.LinkedList$Entry


In this list, [S is a short array, [B is a byte array, [C is a char array (often used in the String class), and [Ljava.lang.Object is an array of objects (often used for ArrayList). This gives a look of every object that lived at one point or another, not necessarily the memory usage at any particular time. Keep this in mind when interpreting results. For example, a new sun.java2d.SunGraphics2D object was created for every frame drawn, and the total size of all those objects was about 700KB, even though probably only one of those objects was referenced at any given time. Obviously, a lot of these objects, such as the WeakReference and MouseEvent objects, are created by the Java API itself. In this example, the short arrays, used for polygon surfaces and usually created only once, take up the most amount of space. Using MIP mapping (as discussed in ) to create smaller surfaces for distance polygons would help reduce this usage. The Vector3D objects are using a large amount of space as well. This can be helped in a couple ways: First, Vector3D objects can be shared more often (in the current state, some abutting polygons have equal vertices but use different Vector3D instances). Second, there is a large overhead per object. You need only 12 bytes of data (4 bytes each for x, y, and z), but with object overhead and rounding to the 8-byte boundary, it comes out to 24 bytes per object. One idea to fix this is to create a Vector3DList object that would hold a list of vectors in an array like this:

float[] x, y, z;


This would reduce object overhead but would give array-access overhead instead. Because using -Xaprof gives a look at only memory used by the instances of each class, you need another way to monitor memory during the game. The Runtime class has a few methods to monitor the heap usage, notably totalMemory() and freeMemory(), which return the heap size and the amount of free heap space, respectively. You can use these methods to monitor the memory usage from frame to frame so you can see how much memory you're allocating (or garbage you're creating) every time you render a frame. To get the amount of memory allocated, just use totalMemory()-freeMemory(). This is summed up in the MemMonitor class in Listing 16.4.

Listing 16.4 MemMonitor.java
package com.brackeen.javagamebook.util;
/**
 Monitors heap size, allocation size, change in allocation,
 change in heap size, and detects garbage collection (when the
 allocation size decreases). Call takeSample() to take a
 "sample" of the current memory state.
*/
public class MemMonitor {
 /**
 The Data class contains info on a series of float values.
 The min, max, sum, and count of the data can be retrieved.
 */
 public static class Data {
 float lastValue = 0;
 float min = Float.MAX_VALUE;
 float max = Float.MIN_VALUE;
 float sum = 0;
 int count = 0;
 public void addValue(float value) {
 lastValue = value;
 sum+=value;
 count++;
 min = Math.min(min, value);
 max = Math.max(max, value);
 }
 public String toString() {
 return "Min: " + toByteFormat(min) + " " +
 "Max: " + toByteFormat(max) + " " +
 "Avg: " + toByteFormat(sum / count);
 }
 }
 private Data heapSize = new Data();
 private Data allocSize = new Data();
 private Data allocIncPerUpdate = new Data();
 private int numHeapIncs = 0;
 private long startTime = System.currentTimeMillis();
 /**
 Takes a sample of the current memory state.
 */
 public void takeSample() {
 Runtime runtime = Runtime.getRuntime();
 long currHeapSize = runtime.totalMemory();
 long currAllocSize = currHeapSize - runtime.freeMemory();
 if (currHeapSize > heapSize.lastValue) {
 numHeapIncs++;
 }
 if (currAllocSize >= allocSize.lastValue) {
 allocIncPerUpdate.addValue(
 (currAllocSize - allocSize.lastValue));
 }
 heapSize.addValue(currHeapSize);
 allocSize.addValue(currAllocSize);
 }
 /**
 Convert number of bytes to string representing bytes,
 kilobytes, megabytes, etc.
 */
 public static String toByteFormat(float numBytes) {
 String[] labels = {" bytes", " KB", " MB", " GB"};
 int labelIndex = 0;
 // decide most appropriate label
 while (labelIndex < labels.length - 1 && numBytes > 1024) {
 numBytes/=1024;
 labelIndex++;
 }
 return (Math.round(numBytes*10)/10f) + labels[labelIndex];
 }
 public String toString() {
 long time = System.currentTimeMillis() - startTime;
 float timeSecs = (float)time / 1000;
 return "Total Time: " + timeSecs + "s\n" +
 "Heap: " + heapSize + "\n" +
 "Allocation: " + allocSize + "\n" +
 "Allocation inc/update: " + allocIncPerUpdate + "\n" +
 "Num Heap Incs: " + numHeapIncs;
 }
}


In this code, the takeSample() method records the current state of the heap usage. In a game, you call this method after drawing every frame. The toString() method returns a string presenting statistics of the samples taken. I ran the test using the 3D engine from using the flags -verbose:gc and -XX:+PrintGCDetails and got these results:

Total Time: 241.859s Heap: Min: 10.7 MB Max: 63.6 MB Avg: 51.8 MB Allocation: Min: 5.5 MB Max: 43.3 MB Avg: 34.1 MB Allocation inc/update: Min: 384.0 bytes Max: 8.5 MB Avg: 6.1 KB Num Heap Incs: 5


Also, it reported 73 young garbage collections and 15 tenured garbage collections. That's one garbage collection about every 2.75 seconds! With this information, you can get an idea about how much memory the game allocates at a maximum (about 44MB). A lot of the garbage collections occur because you start with a small heap that grows in size. When the heap is small, the VM might work harder on garbage collections to try to keep everything in memory. If the memory monitor is accurate, it appears as if drawing a frame requires a minimum of 384 bytes, but on average, you're creating 6.1KB per frame. In this example, a lot of the allocation is caused by created polygon surfaces, which are generally permanent, but things such as new objects for projectiles are also created. Using this information, let's try to tune the heap a little to give fewer garbage collections.

Tuning the Heap

Let's make a goal with the game: Make one garbage collection every 10 seconds, and make the time required to perform a garbage collection short enough that only a frame or so is dropped each time. First pick a heap size. Note that the bigger the heap is, the longer a garbage collection will take. You don't want to just create an arbitrarily large heap. From the previous test, you know the game uses about 44MB and that the max heap size is about 64MB. So, just start with a 64MB heap. You can specify the starting heap size by using the -Xms flag, like this:

java -Xms64m MyCoolGameClass


Next, you know you are creating about 6.1KB per frame. This is just an estimate, and probably an inaccurate one, but it gives you an idea. If you want to collect the garbage once every 10 seconds and you want garbage-collection times to be short, make sure all the memory you create over 10 seconds fits in the young section of the heap. At 60 frames per second, that's 3660KB every 10 seconds. Make sure the young section starts off with about 4MB. You can specify the starting young size with the -Xmn flag, like this:

java -Xmn4m MyCoolGameClass


Okay, let's try it out! I ran the test again with these flags:

-Xms64m -Xmn4m -verbose:gc -XX:+PrintGCDetails


I got these results:

Total Time: 248.094s Heap: Min: 63.6 MB Max: 63.6 MB Avg: 63.6 MB Allocation: Min: 10.1 MB Max: 42.7 MB Avg: 38.8 MB Allocation inc/update: Min: 384.0 bytes Max: 10.1 MB Avg: 3.1 KB Num Heap Incs: 1


Also, it reported 19 young garbage collections, each between 2ms and 16ms, and no tenured garbage collections. At 248 seconds, that's about one minor garbage collection every 13 seconds, and at 60 frames per second, a 16ms garbage collection results in only one dropped frame. Not bad! You met your goal, but this time you also got a lowered average allocation per frame, at 3.1KB. This could be due to many factors, but my guess is that polygon surfaces are kept in memory the entire time because there is enough heap space. With a smaller heap, polygon surfaces might have been flushed out and then re-created later. Notice that these tests were very simple because you ran the demos for only about four minutes. You should gather more data, such as what happens with longer play (20 minutes or more) and with more runs of the game, before making any final decisions on heap tuning. Also, the MemMonitor class could easily be extended to show a visual representation of memory usage within the game. Or, when the game exits, the class could spit out a graph of memory usage from the start of the game until the end. This could help recognize the rate of memory creation for typical scenes and what scenarios cause memory usage to jump.

Tuning the Garbage Collector

Within a program, you can always force a garbage collection yourself by calling this:

System.gc();


For most apps, Sun recommends you don't call this method. However, for games, it might be a good idea to perform a garbage collection in certain situations, such as right before starting a new level or after the startup sequence, if you know you're creating a lot of garbage during those times. This way, you get off to a clean start. Besides the default garbage collector, HotSpot has a few more collectors that might work better in different situations. Here's a quick overview of the available collectors:

  • Incremental garbage collector (-Xingc). Also called the train garbage collector. This garbage collector spreads larger garbage collections over time instead of having longer pauses due to major garbage collections. Unfortunately, this garbage collector uses a lot of processor power, so I don't recommend using it for action games that hog the processor.
  • Parallel garbage collector (-XX:+UseParallelGC). Also called the throughput garbage collector. This performs young garbage collection in a separate thread and works best with multiprocessor machines.
  • Concurrent garbage collector (-XX:+UseConcMarkSweepGC). This performs tenured garbage collection with lower pauses and also works best with multiprocessor machines. The idea is that it can perform garbage collections concurrently while the program runs, instead of stopping the program. You can use a parallel young garbage collector by using (-XX:+UseParNewGC).

Most home users won't have multiprocessor machines, so the default garbage collector will be fine. But as always, experiment with these different collectors to see what kind of results you can get for your game.

Reducing Object Creation

You tuned the heap to make the garbage-collection times almost negligible by keeping the temporary objects you create in the young section of the heap. However, it's still important to create as few objects as possible so that garbage collection can occur less often. Note that you won't be able to avoid all object creation completely. A lot of object creation takes place in the Java API. If you're ever curious about whether a certain class or method creates any objects, look at the Java API source included with the SDK to get an idea of what kinds of objects are created and when. For example, getting an Iterator from a list creates a new Iterator object. Adding an object to a LinkedList creates a new object to store it in the list. And concatenating String objects creates a new object. A few ways to avoid those issues are to use ArrayLists instead of LinkedLists, traverse ArrayLists without using an Iterator, and use a StringBuffer instead of Strings when possible.

Object Reuse

Ideally, you want to try for object reuse as much as possible. In the code of this tutorial, we used plenty of scratch objects used for computations of, for example, polygon transforms. That way, we didn't create a new Vector3D object for every vertex of every polygon we transformed. If you can use one object from frame to frame instead of re-creating it every frame, do it. Sometimes you want to keep an object around only if there's enough memory for it. For example, with polygon surfaces, we didn't need a surface when it went off the screen, but we wanted to keep a copy of it in case we needed it again soon, if there was enough memory for it. In that case, we used soft references. If an object's only reference to it is a soft reference, the garbage collector doesn't collect the object unless there is not enough memory to keep the object around. Objects with no references are collected first. When the heap is full with no other objects to collect, the softly reachable objects are collected. This way, soft references can be used as a sort of object cache. Forgotten how to use soft references already? Well, then, check out for an overview.

Object Pools

An object pool is a collection of objects that can be reused during the execution of an app. Objects are removed from the pool when they are needed and are added to the pool when they are no longer needed. The idea is that using an object pool saves the overhead of continuously allocating and deallocating memory for those objects. Sun actually recommends against using object pools in modern VMs such as HotSpot. Object creation is much faster than it used to be, and if you're creating many short-lived objects, the young section of the heap will do a better job. Think of the young section of the heap as an object pool, and try to keep temporary objects in that section. If the young section starts to overflow with objects, the objects will move to the tenured section, which is slower to garbage-collect. So, if this happens, increase the size of the young section so it doesn't overflow. Probably the worst-case scenario is when you are creating large, temporary objects that live long enough to be copied to the tenured section of the heap and have to be garbage-collected from there. If increasing the young section of the heap doesn't fix this problem, using an object pool might help. Another scenario arises when it is computationally expensive to create an object. For example, building a polygon surface in the 3D engine is computationally expensive (this isn't a great example for an object pool, though, because each polygon surface is different). Most objects won't be computationally expensive to create, however. A third scenario is when you have some large tenured objects that you're finished with, but you don't want the garbage collector to remove them until, say, after a level is finished. If you know you have enough memory, you can put those objects in a pool and then clear the pool at a better time. There's really nothing special about object pools. Basically, you can implement an object pool by keeping a list of objects:

ArrayList objectPool = new ArrayList();


The pool can be filled with objects, and new objects can be removed from the end (removing from the end ensures the array elements don't have to shifted over, unlike if you removed from, say, the beginning). When you are finished with an object, you simply put it back in the list. Optionally, you can extend the idea of object pools using soft references. If each object in the pool is a soft reference, it will be cleared only if the heap runs out of memory. So, HotSpot would garbage-collect items in the soft object pool only if absolutely necessary. The pool would stick around, but the objects in the pool would get collected as needed.



   
Comments