Addressing Performance or Scalability Problems

Once we know where our performance problems are, we can look at addressing them. The earlier they are brought to light, the better our chance of eliminating them without the need for major design changes or reworking existing code. There are many techniques for addressing performance or scalability problems. In this section we'll consider some of the most useful.

Server Choice and Server Configuration

Before looking at design modifications or code changes, we should ensure that the cause of the problem isn't external to the app. Choice of app server and server configuration have a vital influence on app performance, so it's essential to tune the app server to meet the needs of the apps it runs. Unless your code is particularly bad, this will usually produce a better return than optimizing code, at no cost in maintainability. The performance tuning parameters available differ between app servers, but they typically include the following, some of which we've discussed in previous chapters:

Tuning JVM options and operating system configuration is another productive area. Usually the tuning should be appropriate to the app server, rather than to any particular app running on it, so the best place to look for such information is the documentation supplied with your app server. Important JVM options include initial and maximum heap size and garbage-collection parameters; see Sun's documentation for detailed information on Sun 1.3 JVMs. It may also be possible to disable unused app server services, to make more memory available to apps and eliminate unnecessary background thread activity. App server vendors tend to produce good guidelines on performance tuning and J2EE performance in general. Some good resources are:
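As an illustration of the JVM options just mentioned, a Sun 1.3 JVM's heap and garbage-collection behavior are controlled from the server's Java command line. The values and the main class name below are purely illustrative; appropriate settings depend entirely on the particular server and load:

```sh
# Illustrative only: fix the heap at 512MB (same initial and maximum size,
# avoiding heap resizing pauses) and log GC activity for capacity analysis.
# All flags shown are standard Sun JDK 1.3 options.
java -server -Xms512m -Xmx512m -verbose:gc com.mycompany.SomeServerMain
```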

Database configuration is equally important, and requires specialist skills.

Dispensing with Redundant Container Services

While J2EE app servers provide many valuable services, most of these services are not free. Using them unnecessarily may harm performance. Sometimes by avoiding or eliminating the use of unnecessary container services we can improve performance. For example, we can:

By far the most important performance gain is likely to be in avoiding unnecessary remote invocation; we discuss this further below.

A simpler design is often a more performant design, and often leads to no loss in scalability.


Caching

One of the most important techniques to improve performance in J2EE apps is caching: storing data that is expensive to retrieve, so that it can be returned quickly to clients without further retrieval from the original source. Caching can be done at many points in a J2EE architecture, but is most beneficial when it enables one architectural tier to avoid some calls to the tier beneath it. Caching can produce enormous performance gains in distributed apps, by eliminating remote calls. In all apps it can avoid calls from the app server to the database, which will probably involve network round trips as well as the overhead of JDBC.

A successful caching strategy will boost the performance even of those parts of the app that don't directly benefit from cached data. Server response time will be better in general because of the reduced workload, and network bandwidth will be freed. The database will have less work to do, meaning that it responds faster and each database instance may be able to support a larger cluster of J2EE servers.

However, caching poses serious design challenges, whether it is implemented by the J2EE server vendor, the app developer, or a third party, so it should not be used without justification in the form of test results and other solid evidence. If we implement a cache in app code, we may have to write the kind of complex code, such as thread management, that J2EE promised to deliver us from. The fact that J2EE apps may be required to run in a clustered environment can add significant complexity, even if we use a third-party caching solution.

When to Cache

Before we can begin to cache data, there are several questions we should ask. Most relate to the central issues of staleness, contention, and clustering:

The Pareto Principle (the 80/20 rule) is applicable to caching. Most of the performance gain can often be achieved with a small proportion of the effort involved in tackling the more difficult caching issues.


Data caching can radically improve the performance of J2EE apps. However, caching can add much complexity and is a common cause of bugs. The difficulty of implementing different caching solutions varies greatly. Jump at any quick wins, such as caching read-only data. This adds minimal complexity, and can produce a good performance improvement. Think much more carefully about any alternatives when caching is a harder problem - for example, when it concerns read-write data.

Don't rush to implement caching with the assumption that it will be required; base caching policy on performance analysis.

A good app design, with a clean relationship between architectural tiers, will usually facilitate adding any caching required. In particular, interface-based design facilitates caching; we can easily replace any interface with a caching implementation, if business requirements are satisfied. We'll look at an example of a simple cache shortly.
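To make the point about interface-based design concrete, here is a minimal sketch of retrofitting a cache behind an existing interface. All names (QuoteService and its implementations) are hypothetical; the caching decorator implements the same interface, so no calling code changes:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical business interface: callers depend only on this.
interface QuoteService {
    double getQuote(String symbol);
}

// Expensive implementation: stands in for a remote call or database query.
class RemoteQuoteService implements QuoteService {
    int invocations = 0; // exposed only to illustrate how often we hit the backend

    public double getQuote(String symbol) {
        invocations++;
        return 42.0; // stands in for a slow lookup
    }
}

// Caching decorator: same interface, so it can be swapped in transparently
// wherever a QuoteService is expected.
class CachingQuoteService implements QuoteService {
    private final QuoteService delegate;
    private final Map<String, Double> cache = new HashMap<String, Double>();

    CachingQuoteService(QuoteService delegate) {
        this.delegate = delegate;
    }

    public synchronized double getQuote(String symbol) {
        Double cached = cache.get(symbol);
        if (cached == null) {
            cached = delegate.getQuote(symbol); // miss: go to the real source once
            cache.put(symbol, cached);
        }
        return cached;
    }
}
```

Because callers hold a QuoteService reference, the decision to cache (and the caching policy) stays in one place and can be revisited if test results change.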

Where to Cache

As using J2EE naturally produces a layered architecture, there are multiple locations where caching may occur. Some of these types of caching are implemented by the J2EE server or underlying database, and are accessible to the developer via configuration, not code. Other forms of caching must be implemented by developers, and can absorb a large part of total development effort. Let's look at choices for cache locations, beginning from the backend:

Each cache location below is described in terms of where the cache sits, who implements it, the likely performance improvement, and the complexity of implementation.



Location of cache: The RDBMS itself.

Implemented by: RDBMS vendor.

Likely performance improvement: Significant. However, data cached in the RDBMS is still a long way from a user of the app, especially in a distributed app.

Complexity of implementation: No J2EE work required. Some database configuration may be required. Index creation is simple. We may also be able to use more efficient table types, depending on our target database.

Notes: RDBMS caching is often forgotten by J2EE developers. Most RDBMSs cache execution paths for common statements and may cache query results. RDBMS indexes amount to caching ahead of time, and can produce dramatic performance improvements. We looked at the use of PreparedStatements in app code earlier: these can ensure that the RDBMS can perform effective caching. Whatever caching our J2EE server offers, and whatever caching we implement, the database cache should still be a help.

Location of cache: Entity bean cache.

Implemented by: EJB container vendor or vendor of CMP implementation.

Likely performance improvement: Varies. Can be very significant if it greatly reduces the number of calls to the underlying database. However, this presumes a highly efficient entity bean implementation: in a clustered environment, the cache will need to be distributed, raising problems of transactional integrity and replication. An entity bean cache is still a long way from the client: network round trips may be necessary to get to the EJB tier in the first place. Thus I feel that the value of entity bean caching is often overrated in distributed apps.

Complexity of implementation: Nil, or very little.

Notes: The J2EE specification does not guarantee caching, meaning that an architecture that performs satisfactorily only with efficient entity bean caching is not portable. However, a third-party persistence manager might be used with multiple app servers.

Location of cache: Data cache in the data access tier (web container or EJB container), other than an entity bean cache: for example, a JDO implementation or a third-party O/R mapping solution such as TopLink.

Implemented by: Third-party vendor.

Likely performance improvement: Benefits similar to an entity bean cache. Also similar problems: will break in a clustered environment unless it is distributed, meaning that only high-end products will be suitable for scalable deployments.

Notes: Introduces dependence on another product besides the EJB container. However, we may have opted for a third-party O/R mapping tool for other reasons.

Location of cache: Session EJBs.

Implemented by: Developer.

Likely performance improvement: Depends on how expensive the retrieval of cached data was. Doesn't eliminate network round trips to the EJB container in distributed apps.

Notes: It's difficult to use the Singleton design pattern in the EJB tier, so cached data may be duplicated in stateless session bean instances, with the caches potentially in different states. However, cached data will benefit all users of a stateless session bean. Stateful session beans can only cache data on behalf of a single user. The ejbCreate() method is the natural place to retrieve data. However, it is easy to use lazy loading, retrieving resources only when first required in a business method, because EJBs can be implemented as though they are single-threaded. Thus there is no need to perform synchronization, or to worry about race conditions. The ejbRemove() method should be used for freeing resources that can't simply be left to be garbage collected. We discussed data caching in session beans in an earlier chapter.

Location of cache: Business objects running in the web container.

Implemented by: Developer or third-party solution.

Likely performance improvement: Very significant. Eliminates network round trips to the EJB container in distributed apps. Even when EJBs are collocated in the same VM, an invocation on an EJB will be slower than an invocation of a local method.

Notes: Quick wins, such as caching reference data, will produce big returns. However, careful thought is advisable before trying to implement more problematic caching solutions. The J2EE infrastructure cannot help address concurrency issues. However, we may be able to use third-party solutions, two of which are discussed below.

Location of cache: Web tier.

Implemented by: Developer or third-party solution.

Likely performance improvement: Very significant.

Notes: There are a host of alternatives here, such as caching custom tags, caching servlets, and caching filters. Caching filters are particularly attractive, as they enable cache settings to be managed declaratively in the web.xml deployment descriptor.

Location of cache: Web tier.

Implemented by: J2EE server vendor.

Likely performance improvement: Very significant.

Notes: Similar to the above, but provided by the app server vendor. Web tier caching is provided by WebSphere and iPlanet/SunONE, among other servers.

Location of cache: In front of the J2EE app server, achieved by setting HTTP cache control headers or using "edge side" caching.

Implemented by: Developer, possibly relying on a third-party caching product.

Likely performance improvement: Very significant.

Notes: We'll discuss "front" caching for web solutions under Web Tier Performance Issues below.
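The lazy-loading idiom described above for stateless session beans can be sketched as follows. This is plain Java run outside a real EJB container, and the names (CountryBean, the country data) are hypothetical; the point is that because the container invokes each bean instance with only one thread at a time, no synchronization is needed:

```java
import java.util.Arrays;
import java.util.List;

// Sketch of lazy caching in a stateless session bean; in a real EJB this
// class would implement javax.ejb.SessionBean.
class CountryBean {
    private List<String> countries; // cached reference data

    // Business method using lazy loading: the expensive retrieval happens at
    // most once per bean instance, on first use. No synchronization is needed
    // because the EJB container guarantees single-threaded access.
    public List<String> getCountries() {
        if (countries == null) {
            countries = loadCountries();
        }
        return countries;
    }

    private List<String> loadCountries() {
        // stands in for a JDBC query or other expensive call
        return Arrays.asList("Australia", "France", "Japan");
    }

    // ejbRemove() is the place to free resources that garbage collection
    // won't handle; here, simply dropping the reference suffices.
    public void ejbRemove() {
        countries = null;
    }
}
```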

Generally, the closer to the client we can cache, the bigger the performance improvement, especially in distributed apps. The flip side is that the closer to the client we cache, the narrower the range of scenarios that benefit from the cache. For example, if we cache the whole of an app's dynamically generated pages, response time on those pages will be extremely fast (of course, this particular optimization only works for pages that don't contain user-specific information). However, this is a "dumb" form of caching: the cache may have an obvious key for the data (probably the requested URL), but it can't understand the data it is storing, because that data is mixed with presentation markup. Such a cache would be of no use to a Swing client, even if the data in the varying fragments of the cached pages were relevant to a Swing client.
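As an example of the declarative web-tier cache configuration mentioned above, a caching filter might be mapped in web.xml roughly as follows. The filter class name and init parameter are illustrative only, not part of any particular product:

```xml
<!-- Hypothetical caching filter, configured declaratively in web.xml. -->
<filter>
    <filter-name>cachingFilter</filter-name>
    <filter-class>com.mycompany.cache.CachingFilter</filter-class>
    <init-param>
        <param-name>timeoutSeconds</param-name>
        <param-value>300</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>cachingFilter</filter-name>
    <url-pattern>/reference/*</url-pattern>
</filter-mapping>
```

Cache scope and timeout can then be adjusted at deployment time without touching Java code.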


J2EE standard infrastructure is really geared only to support the caching of data in entity EJBs. This option isn't available unless we choose to use entity EJBs (and there are many reasons why we might not). It's also of limited value in distributed apps, as they face as much of a problem in moving data from EJB container to remote client as in moving data from database to EJB container.

Thus we often need to implement our own caching solution, or resort to another third-party caching solution. I recommend the following guidelines for caching:

Third-party Caching Products for Use in J2EE apps

Let's look at some third-party commercial caching products that can be used in J2EE apps. The main reasons we might spend money on a commercial solution are to achieve reliable replicated caching functionality, and to avoid the need to implement and maintain complex caching functionality in-house. Coherence, from Tangosol, is a replicated caching solution, which claims even to support clusters including geographically dispersed servers. Coherence integrates with most leading app servers, including JBoss. Coherence caches are basically alternatives to standard Java map implementations, such as java.util.HashMap, so using them merely requires Coherence-specific implementations of Java core interfaces. SpiritCache, from SpiritSoft, is also a replicated caching solution, and claims to provide a "universal caching framework for the Java platform". The SpiritCache API is based on the proposed JCache standard API (JSR-107). JCache, proposed by Oracle, defines a standard API for caching and retrieving objects, including an event-based system allowing app code to register for notification of cache events.
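Because such products expose caches through the standard java.util.Map interface, app code can be written against Map and the implementation swapped later: a plain HashMap in development, a replicated cache in a clustered deployment. A minimal sketch, with hypothetical names:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical catalog that depends only on java.util.Map, so the cache
// implementation (local or replicated) is chosen by whoever constructs it.
class ProductCatalog {
    private final Map<String, String> cache;

    ProductCatalog(Map<String, String> cache) {
        this.cache = cache;
    }

    public String getDescription(String productId) {
        String description = cache.get(productId);
        if (description == null) {
            description = loadDescription(productId); // expensive call in a real app
            cache.put(productId, description);
        }
        return description;
    }

    private String loadDescription(String productId) {
        return "Description of " + productId; // stands in for a database lookup
    }
}
```

Swapping in a clustered cache is then a one-line change at construction time, not a rewrite of the lookup logic.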


Commercial caching products are likely to prove a very good investment for apps with sophisticated caching requirements, such as the need for caching across a cluster of servers. Developing and maintaining complex caching solutions in-house can prove very expensive. However, even if we use third-party products, running a clustered cache will significantly complicate app deployment, as the caching product - in addition to the J2EE app server - will need to be configured appropriately for our clustered environment.

Code Optimization

Since design largely determines performance, unless app code is particularly badly written, code optimization is seldom worth the effort in J2EE apps unless it is targeted at known problem areas. However, all professional developers should be familiar with performance issues at code level to avoid making basic errors. For discussion of Java performance in general, I recommend Java Performance Tuning by Jack Shirazi (O'Reilly) and Java 2 Performance and Idiom Guide (Prentice Hall). There are also many good online resources on performance tuning. Shirazi maintains a performance tuning web site that contains an exhaustive directory of code tuning tips from many sources.


Avoid code optimizations that reduce maintainability unless there is an overriding performance imperative. Such "optimizations" are not just a one-off effort, but are likely to prove an ongoing cost and cause of bugs.

The higher-level the coding issue, the bigger the potential performance gain from code optimization. Thus there is often potential to achieve good results through techniques such as reordering the steps of an algorithm, so that expensive tasks are executed only if absolutely essential. As with design, an ounce of prevention is worth a pound of cure. While obsession with performance is counter-productive, good programmers don't write grossly inefficient code that will later need optimization. Sometimes, however, it does make sense to try a simple algorithm first, and change the implementation to use a faster but more complex algorithm only if it proves necessary. Really low-level techniques such as loop unrolling are unlikely to bring any benefit to J2EE systems.

Any optimization should be targeted, and based on the results of profiling. When looking at profiler output, concentrate on the slowest five methods; effort directed elsewhere will probably be wasted. The following table lists some potential code optimizations (worthwhile and counter-productive), to illustrate some of the tradeoffs between performance and maintainability to be considered:


Each optimization below is described in terms of the likely performance improvement and the effect on maintainability.

Optimization: Minimize object creation, through techniques such as object pooling and "canonicalizing" objects (preventing the creation of multiple objects representing the same value).

Performance improvement: Varies. May reduce the work of garbage collection. The performance benefit may not be very great with the sophisticated garbage collection of newer VMs.

Effect on maintainability: Implementing such algorithms may be complex, and code may become harder to read. This is the kind of performance issue that should be kept in mind when writing code in the first place: we shouldn't create large numbers of objects without good reason if there is an alternative, such as using primitives.

Optimization: Use the correct collection type: for example, java.util.LinkedList when we don't know how many elements to expect and elements will be added one at a time, or java.util.ArrayList when we know how many elements to expect. Remember all those data structures modules in Computer Science I? Sun has implemented many of the standard data structures as core library collections, meaning we just need to choose the most appropriate.

Performance improvement: Varies. May be very significant if the list grows unpredictably or requires sorting.

Effect on maintainability: None. We should access the collection through its interface (such as java.util.List) rather than its concrete class (such as java.util.LinkedList).

Optimization: Use an exception, rather than a check, to end a loop.

Performance improvement: Varies with the virtual machine.

Effect on maintainability: Likely to make code harder to read. This is an example of an optimization that should be avoided if possible.

Optimization: Use final classes and methods.

Effect on maintainability: It's often good style to use final classes and methods, so we often apply this "optimization" for other reasons.

Optimization: Avoid using System.out.

Performance improvement: Significant if a lot of output is involved.

Effect on maintainability: Positive in any case: it's vital that an enterprise app uses a proper logging framework, as discussed earlier in this book.

Optimization: Avoid evaluating unnecessary conditions. Java guarantees "short-circuit" evaluation of ands and ors, so we should perform the quickest checks first, potentially avoiding the need to evaluate slower checks.

Performance improvement: Can be significant if the piece of code is frequently invoked.


Optimization: Avoid concatenation operations on Strings, using StringBuffers in preference. As Strings are immutable, String operations are likely to be inefficient and wasteful, resulting in the creation of many short-lived objects.

Performance improvement: May be significant, depending on the JVM.

Effect on maintainability: Due to the significant performance benefit, this is a case where a professional developer should simply get used to reading the slightly more verbose StringBuffer syntax. Note that the HotSpot JVM in Sun's JDK 1.3 appears to perform such optimization automatically; however, it's best not to rely on this.

Optimization: Avoid unnecessary String or StringBuffer operations. Even StringBuffer operations are relatively slow.

Effect on maintainability: None. This is an area where we can safely achieve quick wins.

Optimization: Minimize the use of interfaces, as they may be slower to invoke than classes.

Performance improvement: Very slight.

Effect on maintainability: This is the kind of "optimization" that has the potential to wreck a codebase. The marginal performance gain isn't worth the damage this approach can wreak.

String and StringBuffer operations can have a big impact on performance. Even StringBuffer operations are surprisingly expensive, as use of a profiler such as JProbe quickly demonstrates. Be very aware of string operations in heavily used code, making sure they are necessary and as efficient as possible. As an example of this, consider logging in our sample app. The following seemingly innocent statement in our TicketController web controller, performed only once, accounts for a surprisingly high 5% of total execution time when a user requests information about a reservation already held in their session:

 logger.fine("Reservation request is [" + reservationRequest + "]");

The problem is not the logging statement itself, but that of performing a string operation (which HotSpot optimizes to a StringBuffer operation) and invoking the toString() method on the ReservationRequest object, which performs several further string operations. Adding a check as to whether the log message will ever be displayed, to avoid creating it if it won't be, will all but eliminate this cost in production, as any good logging package provides highly efficient querying of log configuration:

 if (logger.isLoggable(Level.FINE)) {
     logger.fine("Reservation request is [" + reservationRequest + "]");
 }

Of course a 5% performance saving is no big deal in most cases, but such careless use of logging can be much more critical in frequently-invoked methods. Such conditional logging is essential in heavily used code.


Generating log output usually has a minor impact on performance. However, building log messages unnecessarily, especially if it involves unnecessary toString() invocations, can be surprisingly expensive.
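Returning to String operations in general: the String-versus-StringBuffer tradeoff from the table earlier can be made concrete with a small sketch (names illustrative). Concatenation in a loop creates a new String on every iteration, while a single StringBuffer is reused throughout:

```java
class StringBuilding {
    // Concatenation in a loop: each iteration creates a new String (and, on
    // most compilers, a hidden temporary StringBuffer as well).
    static String concat(String[] words) {
        String result = "";
        for (int i = 0; i < words.length; i++) {
            result = result + words[i];
        }
        return result;
    }

    // One StringBuffer reused across the loop: far fewer temporary objects.
    static String buffered(String[] words) {
        StringBuffer buf = new StringBuffer();
        for (int i = 0; i < words.length; i++) {
            buf.append(words[i]);
        }
        return buf.toString();
    }
}
```

Both methods produce identical output; only the amount of garbage generated differs, which is why the buffered form matters in heavily used code.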

Two particularly tricky issues are synchronization and reflection. These are potentially important because they sit midway between design and implementation. Let's take a closer look at each in turn. Correct use of synchronization is an issue of both design and coding. Excessive synchronization throttles performance and has the potential to deadlock; insufficient synchronization can cause state corruption. Synchronization issues often arise when implementing caching. The essential reference on Java threading is Doug Lea's Concurrent Programming in Java: Design Principles and Patterns (Addison-Wesley). I strongly recommend referring to this book when implementing any complex multi-threaded code. However, the following tips may be useful:
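One such tip is to keep synchronized blocks as narrow as possible, so that slow work is never done while holding a lock. The sketch below (names hypothetical) applies this to a simple cache; note the deliberate, benign race in which two threads may both compute the same value, a common tradeoff for keeping lock scope small:

```java
import java.util.HashMap;
import java.util.Map;

class SynchronizedCache {
    private final Map<String, Object> cache = new HashMap<String, Object>();

    public Object get(String key) {
        // Narrow synchronized block: only the map read is under the lock.
        synchronized (cache) {
            Object value = cache.get(key);
            if (value != null) {
                return value;
            }
        }
        // Slow work done outside the lock, so other threads aren't blocked.
        Object value = expensiveLookup(key);
        synchronized (cache) {
            // Benign race: a concurrent thread may have put the same value.
            cache.put(key, value);
        }
        return value;
    }

    private Object expensiveLookup(String key) {
        return "value for " + key; // stands in for a slow retrieval
    }
}
```

Synchronizing the whole get() method would also be correct, but would serialize the expensive lookups themselves, which is exactly the throttling effect warned about above.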

Reflection has a reputation for being slow. Since reflection is central to much J2EE functionality and a powerful tool in writing generic Java code, it's worth taking a close look at the performance issues involved. A close look reveals that most of the fear surrounding the performance of reflection is unwarranted. To illustrate this, I ran a simple test to time four basic reflection operations:

The source code for the test can be found in the sample app download, under the path /framework/test/reflection/Tests.java. The following method was invoked via reflection:

 public String foo(int i) {
     return "This is a string with a number " + i + " in it";
 }
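The benchmark below is a minimal reconstruction in the spirit of the test described, not the actual Tests.java source; the loop count and structure are assumptions. It times reflective invocation of foo(int) against direct invocation, and the absolute numbers will of course vary by JVM and hardware:

```java
import java.lang.reflect.Method;

class ReflectionTiming {
    public String foo(int i) {
        return "This is a string with a number " + i + " in it";
    }

    public static void main(String[] args) throws Exception {
        ReflectionTiming target = new ReflectionTiming();
        Method foo = ReflectionTiming.class.getMethod("foo", int.class);
        int iterations = 100000;

        // Reflective invocation of foo(int).
        long start = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            foo.invoke(target, i);
        }
        long reflective = System.currentTimeMillis() - start;

        // Direct invocation of the same method, for comparison.
        start = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            target.foo(i);
        }
        long direct = System.currentTimeMillis() - start;

        System.out.println("Reflective: " + reflective + "ms, direct: " + direct + "ms");
    }
}
```

On modern JVMs the per-invocation overhead of Method.invoke() is small; the String concatenation inside foo() typically dwarfs it, which is the point of the conclusions that follow.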

The most important results, from running these tests concurrently on a 1GHz Pentium III under JDK 1.3.1_02, were:

My conclusions, from this and tests I have run in the past, and experience from developing real apps, are that:


The assumption among many Java developers that "reflection is slow" is misguided, and becoming increasingly anachronistic with maturing JVMs. Avoiding reflection is pointless except in unusual circumstances - for example, in a deeply nested loop. Appropriate use of reflection has many benefits, and its performance overhead is nowhere near sufficient to justify avoiding it. Of course app code will normally use reflection only via an abstraction provided by infrastructure code.