Once we know where our performance problems are, we can look at addressing them. The earlier they are brought to light, the better our chance of eliminating them without the need for major design changes or reworking existing code. There are many techniques for addressing performance or scalability problems. In this section we'll consider some of the most useful.
Before looking at design modifications or code changes, we should ensure that the cause of the problem isn't external to the app. Choice of app server and server configuration have a major influence on app performance, so it's vital to tune the app server to meet the needs of the apps it runs. Unless your code is particularly bad, this will usually produce a better return than optimizing code, at no cost in maintainability. The performance tuning parameters available differ between app servers, but they typically include the following, some of which we've discussed in previous chapters:
Thread pool size. This will affect both the web container and EJB container. Too many threads will result in the JVM running into operating system limitations on how many threads can efficiently run concurrently; too few threads will result in unnecessary throttling of throughput.
Tuning JVM options and operating system configuration is another productive area. Usually the tuning should be appropriate to the app server, rather than to any particular app running on it, so the best place to look for such information is the documentation supplied with your app server. Important JVM options include initial and maximum heap size and garbage-collection parameters. See http://java.oracle.com/docs/hotspot/gc/ for detailed information on Sun 1.3 JVMs. It may also be possible to disable unused app server services, to make more memory available to apps and to eliminate unnecessary background thread activity. App server vendors tend to produce good guidelines on performance tuning and J2EE performance in general.
Database configuration is equally important, and requires specialist skills.
While J2EE app servers provide many valuable services, most of these services are not free. Using them unnecessarily may harm performance. Sometimes by avoiding or eliminating the use of unnecessary container services we can improve performance. For example, we can:
Avoid unnecessary use of EJB. All EJB invocations - even local invocations - carry an overhead of container interception. Thus the use of EJB where it doesn't deliver real value can reduce performance.
By far the most important performance gain is likely to be in avoiding unnecessary remote invocation; we discuss this further below.
A simpler design is often a more performant design, and often leads to no loss in scalability.
One of the most important techniques for improving performance in J2EE apps is caching: storing data that is expensive to retrieve so that it can be returned quickly to clients, without further retrieval from the original source. Caching can be done at many points in a J2EE architecture, but is most beneficial when it enables one architectural tier to avoid some calls to the tier beneath it. Caching can produce enormous performance gains in distributed apps, by eliminating remote calls. In all apps it can avoid calls from the app server to the database, which will probably involve network round trips as well as the overhead of JDBC.

A successful caching strategy will boost the performance even of those parts of the app that don't directly benefit from cached data. Server response time will improve in general because of the reduced workload, and network bandwidth will be freed. The database will have less work to do, meaning that it responds faster and that each database instance may be able to support a larger cluster of J2EE servers.

However, caching poses serious design challenges, whether it is implemented by the J2EE server vendor, the app developer, or a third party, so it should not be used without justification in the form of test results and other solid evidence. If we implement a cache in app code, we may have to write the kind of complex code, such as thread management, that J2EE promised to deliver us from. The requirement that J2EE apps may need to run in a clustered environment can add significant complexity, even if we use a third-party caching solution.
Before we can begin to cache data, there are several questions we should ask. Most relate to the central issues of staleness, contention, and clustering:
How slow is it to get the data without caching it? Will introducing a cache improve the performance of the app enough to justify the additional complexity? Caching to avoid network round trips is most likely to be worthwhile; since caching usually adds complexity, we shouldn't implement it unless it's necessary.
The Pareto Principle (the 80/20 rule) is applicable to caching. Most of the performance gain can often be achieved with a small proportion of the effort involved in tackling the more difficult caching issues.
Important: Data caching can radically improve the performance of J2EE apps. However, caching can add much complexity and is a common cause of bugs. The difficulty of implementing different caching solutions varies greatly. Jump at any quick wins, such as caching read-only data: this adds minimal complexity, and can produce a good performance improvement. Think much more carefully about the alternatives when caching is a harder problem - for example, when it concerns read-write data. Don't rush to implement caching with the assumption that it will be required; base caching policy on performance analysis.
A good app design, with a clean relationship between architectural tiers, will usually facilitate adding any caching required. In particular, interface-based design facilitates caching; we can easily replace any interface with a caching implementation, if business requirements are satisfied. We'll look at an example of a simple cache shortly.
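As an illustrative sketch of this point (the interface and class names here are hypothetical, not taken from the sample app), a caching implementation can simply decorate an existing implementation of the same business interface, so callers need never know a cache is present:

```java
import java.util.HashMap;
import java.util.Map;

public class CachingExample {

    // Hypothetical business interface: callers depend only on this.
    public interface CountryDao {
        String getCountryName(String isoCode);
    }

    // Expensive implementation, e.g. backed by JDBC (simulated here).
    public static class JdbcCountryDao implements CountryDao {
        public String getCountryName(String isoCode) {
            // A real implementation would query the database.
            return "Country-" + isoCode;
        }
    }

    // Caching decorator: implements the same interface, so it can be
    // substituted without any change to business code.
    public static class CachingCountryDao implements CountryDao {
        private final CountryDao delegate;
        private final Map cache = new HashMap();

        public CachingCountryDao(CountryDao delegate) {
            this.delegate = delegate;
        }

        public synchronized String getCountryName(String isoCode) {
            String name = (String) cache.get(isoCode);
            if (name == null) {
                name = delegate.getCountryName(isoCode);
                cache.put(isoCode, name);  // later calls avoid the delegate
            }
            return name;
        }
    }

    public static void main(String[] args) {
        CountryDao dao = new CachingCountryDao(new JdbcCountryDao());
        System.out.println(dao.getCountryName("AU"));  // first call hits the delegate
        System.out.println(dao.getCountryName("AU"));  // second call served from cache
    }
}
```

Because business code holds only a `CountryDao` reference, swapping the caching decorator in or out is purely a matter of configuration.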
As using J2EE naturally produces a layered architecture, there are multiple locations where caching may occur. Some of these types of caching are implemented by the J2EE server or underlying database, and are accessible to the developer via configuration, not code. Other forms of caching must be implemented by developers, and can absorb a large part of total development effort. Let's look at choices for cache locations, beginning from the backend:
| Location of cache | Implemented by | Likely performance improvement | Complexity of implementation | Notes |
|---|---|---|---|---|
| Database | RDBMS vendor | Significant. However, data cached in the RDBMS is still a long way from the user of the app, especially in a distributed app. | No J2EE work required. Some database configuration may be required. Index creation is simple. We may also be able to use more efficient table types, depending on our target database. | RDBMS caching is often forgotten by J2EE developers. Most RDBMSs cache execution plans for common statements and may cache query results. RDBMS indexes amount to caching ahead of time, and can produce dramatic performance improvements. We looked at the use of PreparedStatements in app code in : this can ensure that the RDBMS can perform effective caching. |
| Entity bean cache | EJB container vendor or vendor of CMP implementation | Varies. Can be very significant if it greatly reduces the number of calls to the underlying database. However, this presumes a highly efficient entity bean implementation: in a clustered environment, the cache will need to be distributed, raising problems of transactional integrity and replication. | Nil, or very little. | The J2EE specification does not guarantee caching, meaning that an architecture that performs satisfactorily only with efficient entity bean caching is not portable. However, a third-party persistence manager might be used with multiple app servers. |
| Data cache in the data access tier - that is, web container or EJB container - other than an entity bean cache: for example, a JDO implementation or a third-party O/R mapping solution such as TopLink | Third-party vendor | Benefits similar to an entity bean cache. Also similar problems: will break in a clustered environment unless it is distributed, meaning that only high-end products will be suitable for scalable deployments. | Little. | Introduces dependence on another product besides the EJB container. However, we may have opted for a third-party O/R mapping tool for other reasons. |
| Session EJBs | Developer | Depends on how expensive the retrieval of cached data was. Doesn't eliminate network round trips to the EJB container in distributed apps. | Little-to-moderate. | It's difficult to use the Singleton design pattern in the EJB tier, so cached data may be duplicated in stateless session bean instances, with the caches potentially in different states. However, cached data will benefit all users of a stateless session bean. |
| Business objects running in the web container | Developer or third-party solution | Very significant. Eliminates network round trips to the EJB container in distributed apps. Even when EJBs are collocated in the same JVM, an invocation on an EJB will be slower than an invocation of a local method. | Moderate-to-high. | Quick wins, such as caching reference data, will produce big returns. However, careful thought is advisable before trying to implement more problematic caching solutions. The J2EE infrastructure cannot help address concurrency issues. However, we may be able to use third-party solutions, two of which are discussed below. |
| Web tier | Developer or third-party solution | Very significant. | Moderate-to-high. | There are a host of alternatives here, such as caching custom tags, caching servlets, and caching filters. Caching filters are particularly attractive, as they enable cache settings to be managed declaratively in the web.xml deployment descriptor. |
| Web tier | J2EE server vendor | Very significant. | Little-to-moderate. | Similar to the above, but provided by the app server vendor. Web tier caching is provided by WebSphere and iPlanet/SunONE, among other servers. |
| Cache in front of the J2EE app server, achieved by setting HTTP cache control headers or "edge side" caching | Developer, possibly relying on a third-party caching product | Very significant. | Little. | We'll discuss "front" caching for web solutions under Web Tier Performance Issues below. |
Generally, the closer to the client we can cache, the bigger the performance improvement, especially in distributed apps. The flip side is that the closer to the client we cache, the narrower the range of scenarios that benefit from the cache. For example, if we cache the whole of an app's dynamically generated pages, response time on these pages will be extremely fast (of course, this particular optimization only works for pages that don't contain user-specific information). However, this is a "dumb" form of caching - the cache may have an obvious key for the data (probably the requested URL), but it can't understand the data it is storing, because it is mixed with presentation markup. Such a cache would be of no use to a Swing client, even if the data in the varying fragments of the cached pages were relevant to a Swing client.
Important: J2EE standard infrastructure is really geared only to support the caching of data in entity EJBs. This option isn't available unless we choose to use entity EJBs (and there are many reasons why we might not). It's also of limited value in distributed apps, which face as much of a problem in moving data from the EJB container to a remote client as in moving data from the database to the EJB container.
Thus we often need to implement our own caching solution, or resort to another third-party caching solution. I recommend the following guidelines for caching:
Avoid caching unless it involves reference data (in which case it's simple to implement) or unless performance clearly requires it. In general, distributed apps are much more likely to need to implement data caching than collocated apps.
Let's look at some third-party commercial caching products that can be used in J2EE apps. The main reasons we might spend money on a commercial solution are to achieve reliable replicated caching functionality and to avoid the need to implement and maintain complex caching functionality in-house.

Coherence, from Tangosol (http://www.tangosol.com/products-clustering.jsp), is a replicated caching solution that claims even to support clusters including geographically dispersed servers. Coherence integrates with most leading app servers, including JBoss. Coherence caches are basically alternatives to standard Java map implementations, such as java.util.HashMap, so using them merely requires Coherence-specific implementations of Java core interfaces.

SpiritCache, from SpiritSoft (http://www.spiritsoft.net/products/jms-jcache/overview.html), is also a replicated caching solution, and claims to provide a "universal caching framework for the Java platform". The SpiritCache API is based on the proposed JCache standard API (JSR-107: http://jcp.org/jsr/detail/107.jsp). JCache, proposed by Oracle, defines a standard API for caching and retrieving objects, including an event-based system allowing app code to register for notification of cache events.
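Since such products typically expose themselves through standard Java map interfaces, coding to java.util.Map keeps the choice of cache implementation out of application code. The following minimal sketch (class and key names are illustrative only) shows the idea:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: app code depends only on java.util.Map, so a local
// map can later be replaced by a replicated cache that implements Map
// without changing this class.
public class MapCacheExample {

    private final Map cache;

    // The concrete Map implementation is supplied by configuration.
    public MapCacheExample(Map cache) {
        this.cache = cache;
    }

    public void store(Object key, Object value) {
        cache.put(key, value);
    }

    public Object lookup(Object key) {
        return cache.get(key);
    }

    public static void main(String[] args) {
        // Local deployment: a synchronized HashMap. In a clustered
        // deployment, a distributed Map implementation could be passed
        // in instead, with no change to the code above.
        MapCacheExample cache = new MapCacheExample(
                Collections.synchronizedMap(new HashMap()));
        cache.store("itinerary", "London-Paris");
        System.out.println(cache.lookup("itinerary"));
    }
}
```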
Important: Commercial caching products are likely to prove a very good investment for apps with sophisticated caching requirements, such as the need for caching across a cluster of servers. Developing and maintaining complex caching solutions in-house can prove very expensive. However, even if we use third-party products, running a clustered cache will significantly complicate app deployment, as the caching product - in addition to the J2EE app server - will need to be configured appropriately for our clustered environment.
Since design largely determines performance, code optimization is seldom worth the effort in J2EE apps unless it is targeted at known problem areas, or unless app code is particularly badly written. However, all professional developers should be familiar with performance issues at the code level, to avoid making basic errors. For discussion of Java performance in general, I recommend Java Performance Tuning by Jack Shirazi (O'Reilly) and Java 2 Performance and Idiom Guide (Prentice Hall). There are also many good online resources on performance tuning. Shirazi maintains a performance tuning web site (http://www.javaperformancetuning.com/) that contains an exhaustive directory of code tuning tips from many sources.
Important: Avoid code optimizations that reduce maintainability unless there is an overriding performance imperative. Such "optimizations" are not just a one-off effort, but are likely to prove an ongoing cost and cause of bugs.
The higher-level the coding issue, the bigger the potential performance gain from code optimization. Thus there is often potential to achieve good results through techniques such as reordering the steps of an algorithm so that expensive tasks are executed only if absolutely essential. As with design, an ounce of prevention is worth a pound of cure. While obsession with performance is counter-productive, good programmers don't write grossly inefficient code that will later need optimization. Sometimes, however, it does make sense to try a simple algorithm first, and change the implementation to use a faster but more complex algorithm only if it proves necessary. Really low-level techniques such as loop unrolling are unlikely to bring any benefit to J2EE systems. Any optimization should be targeted, and based on the results of profiling. When looking at profiler output, concentrate on the five slowest methods; effort directed elsewhere will probably be wasted. The following table lists some potential code optimizations (worthwhile and counter-productive), to illustrate some of the tradeoffs between performance and maintainability to be considered:
| Technique | Performance improvement | Effect on maintainability |
|---|---|---|
| Minimize object creation, through techniques such as object pooling and "canonicalizing" objects (preventing the creation of multiple objects representing the same value). | Varies. May reduce the work of garbage collection. The performance benefit may not be very great with the sophisticated garbage collection of newer JVMs. | Implementing such algorithms may be complex. Code may become harder to read. |
| Use the correct collection type: for example, java.util.LinkedList when we don't know how many elements to expect and elements will be added one at a time, or java.util.ArrayList when we know how many elements to expect. Remember all those data structures modules in Computer Science I? Sun has implemented many of the standard data structures as core library collections, so we just need to choose the most appropriate. | Varies. May be very significant if the list grows unpredictably or requires sorting. | None. We should access the collection through its interface (such as java.util.List) rather than its concrete class (such as java.util.LinkedList). |
| Use an exception rather than a check to end a loop. | Varies with virtual machine. | Likely to make code harder to read. This is an example of an optimization that should be avoided if possible. |
| Use final classes and methods. | Slight. | It's often good style to use final classes and methods, so we often use this "optimization" for other reasons (see ). |
| Avoid using System.out. | Significant if a lot of output is involved. | In any case, it's vital that an enterprise app uses a proper logging framework. This is discussed in . |
| Avoid evaluating unnecessary conditions. Java guarantees "short-circuit" evaluation of ands and ors, so we should perform the quickest checks first, potentially avoiding the need to evaluate slower checks. | Can be significant if a piece of code is frequently invoked. | None. |
| Avoid operations on Strings, using StringBuffers in preference. As Strings are immutable, String operations are likely to be inefficient and wasteful, resulting in the creation of many short-lived objects. | May be significant, depending on the JVM. | Due to the significant performance benefit, this is a case where a professional developer should simply get used to reading the slightly more verbose StringBuffer syntax. |
| Avoid unnecessary String or StringBuffer operations. Even StringBuffer operations are relatively slow. | Significant. | None. This is an area where we can safely achieve quick wins. |
| Minimize the use of interfaces, as they may be slower to invoke than classes. | Very slight. | This is the kind of "optimization" that has the potential to wreck a codebase. The marginal performance gain isn't worth the damage this approach can wreak. |
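The short-circuit-evaluation advice above can be sketched as follows (the method names and simulated costs are illustrative, not from the sample app); because `&&` stops evaluating as soon as one operand is false, putting the cheap check first means the slow check often never runs:

```java
// Illustrative sketch of short-circuit evaluation: cheapCheck() runs first,
// so the slow expensiveCheck() is evaluated only when cheapCheck() passes.
public class ShortCircuitExample {

    public static boolean cheapCheck(int i) {
        return i % 2 == 0;  // trivial arithmetic: effectively free
    }

    public static boolean expensiveCheck(int i) {
        // Simulates a slow check, e.g. one involving I/O or parsing.
        try {
            Thread.sleep(1);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
        return true;
    }

    public static void main(String[] args) {
        int matches = 0;
        for (int i = 0; i < 1000; i++) {
            // Because of short-circuiting, expensiveCheck() runs only for
            // the ~500 even values of i, roughly halving the loop's cost.
            if (cheapCheck(i) && expensiveCheck(i)) {
                matches++;
            }
        }
        System.out.println(matches + " matches");
    }
}
```

Reversing the operand order would force the expensive check on every iteration, with no change in the result.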
String and StringBuffer operations can have a big impact on performance. Even StringBuffer operations are surprisingly expensive, as use of a profiler such as JProbe quickly demonstrates. Be very aware of string operations in heavily used code, making sure they are necessary and as efficient as possible. As an example of this, consider logging in our sample app. The following seemingly innocent statement in our TicketController web controller, performed only once, accounts for a surprisingly high 5% of total execution time if a user requests information about a reservation already held in their session:
logger.fine("Reservation request is [" + reservationRequest + "]");
The problem is not the logging statement itself, but that of performing a string operation (which HotSpot optimizes to a StringBuffer operation) and invoking the toString() method on the ReservationRequest object, which performs several further string operations. Adding a check as to whether the log message will ever be displayed, to avoid creating it if it won't be, will all but eliminate this cost in production, as any good logging package provides highly efficient querying of log configuration:
if (logger.isLoggable(Level.FINE)) logger.fine("Reservation request is [" + reservationRequest + "]");
Of course a 5% performance saving is no big deal in most cases, but such careless use of logging can be much more critical in frequently-invoked methods. Such conditional logging is essential in heavily used code.
Important: Generating log output usually has a minor impact on performance. However, building log messages unnecessarily, especially if it involves unnecessary toString() invocations, can be surprisingly expensive.
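To make the String-versus-StringBuffer tradeoff from the earlier table concrete, here is a minimal sketch (illustrative only) contrasting the two ways of building a string in a loop:

```java
// Illustrative sketch: repeated String concatenation in a loop creates many
// short-lived objects, while a StringBuffer appends into one growing buffer.
// (On modern JVMs, StringBuilder would normally be preferred; StringBuffer
// matches the J2EE-era API discussed in the text.)
public class StringBufferExample {

    public static String slowBuild(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i + ",";  // creates a new String (and temporary buffer) each pass
        }
        return s;
    }

    public static String fastBuild(int n) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < n; i++) {
            sb.append(i).append(',');  // appends in place
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long t = System.currentTimeMillis();
        slowBuild(10000);
        long slow = System.currentTimeMillis() - t;

        t = System.currentTimeMillis();
        fastBuild(10000);
        long fast = System.currentTimeMillis() - t;

        // Actual timings vary by JVM; the String version is typically much slower.
        System.out.println("String: " + slow + "ms, StringBuffer: " + fast + "ms");
    }
}
```

Both methods produce identical output; only the amount of object creation differs.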
Two particularly tricky issues are synchronization and reflection. These are potentially important because they sit midway between design and implementation. Let's take a closer look at each in turn. Correct use of synchronization is an issue of both design and coding. Excessive synchronization throttles performance and has the potential to deadlock; insufficient synchronization can cause state corruption. Synchronization issues often arise when implementing caching. The essential reference on Java threading is Concurrent Programming in Java: Design Principles and Patterns by Doug Lea (Addison-Wesley). I strongly recommend referring to this book when implementing any complex multi-threaded code. However, the following tips may be useful:
Don't assume that synchronization will always prove disastrous for performance. Base decisions empirically. Especially if operations executed under synchronization execute quickly, synchronization may ensure data integrity with minimal impact on performance. We'll look at a practical example of the issues relating to synchronization later in this chapter.
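As a sketch of keeping synchronized operations quick (the class and method names are hypothetical), the expensive work can be done outside the lock, with only the fast map operations synchronized. Note the deliberate tradeoff: two threads may occasionally compute the same value concurrently, which is usually acceptable for a cache:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: only quick map operations run under the lock, so
// contention is minimal even with many concurrent callers.
public class QuickSyncCache {

    private final Map cache = new HashMap();

    public Object get(Object key) {
        synchronized (cache) {               // quick: just a map lookup
            Object value = cache.get(key);
            if (value != null) {
                return value;
            }
        }
        Object value = expensiveLookup(key); // slow work done outside the lock
        synchronized (cache) {               // quick: just a map put
            cache.put(key, value);
        }
        return value;
    }

    protected Object expensiveLookup(Object key) {
        // Placeholder for a database query or remote call.
        return "value-for-" + key;
    }

    public static void main(String[] args) {
        QuickSyncCache cache = new QuickSyncCache();
        System.out.println(cache.get("x"));  // computed on first access
        System.out.println(cache.get("x"));  // served from the cache
    }
}
```

Holding the lock across `expensiveLookup()` would serialize all callers behind the slow operation, which is exactly the kind of throttling the tip above warns against.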
Reflection has a reputation for being slow. Reflection is central to much J2EE functionality and a powerful tool in writing generic Java code, so it's worth taking a close look at the performance issues involved. Such a look reveals that most of the fear surrounding the performance of reflection is unwarranted. To illustrate this, I ran a simple test to time four basic reflection operations:
Loading a class by name with the Class.forName (String) method. The cost of invoking this method depends on whether the requested class has already been loaded. Any operation - using reflection or not - will be much slower if it requires a class to be loaded for the first time.
The source code for the test can be found in the sample app download, under the path /framework/test/reflection/Tests.Java. The following method was invoked via reflection:
public String foo(int i) { return "This is a string with a number " + i + " in it"; }
The most important results, in running these tests concurrently on a 1GHz Pentium III under JDK 1.3.1_02, were:
10,000 invocations of this method via Method.invoke() took 480ms.
My conclusions, from this and tests I have run in the past, and experience from developing real apps, are that:
Invoking a method using reflection is very fast once a reference to the Method object is available, so when using reflection, try to cache the results of introspection where possible. Remember that a Method can be invoked on any object of the declaring class. If the method does any work at all, the cost of that work is likely to outweigh the cost of reflective invocation.
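A minimal sketch of this guideline (not the actual test class from the sample app download): perform the introspection once, keep the resulting Method reference, and reuse it for repeated invocations:

```java
import java.lang.reflect.Method;

// Illustrative sketch: the expensive step is looking the Method up;
// invoking a cached Method reference is cheap.
public class ReflectionExample {

    public String foo(int i) {
        return "This is a string with a number " + i + " in it";
    }

    public static void main(String[] args) throws Exception {
        ReflectionExample target = new ReflectionExample();

        // Introspection: do this once and cache the result.
        Method m = ReflectionExample.class.getMethod(
                "foo", new Class[] { int.class });

        // Repeated invocations reuse the cached Method object.
        String last = null;
        for (int i = 0; i < 10000; i++) {
            last = (String) m.invoke(target, new Object[] { Integer.valueOf(i) });
        }
        System.out.println(last);
    }
}
```

Re-running the lookup (`getMethod()`) inside the loop, by contrast, would repeat the expensive introspection on every iteration.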
Important: The assumption among many Java developers that "reflection is slow" is misguided, and becoming increasingly anachronistic with maturing JVMs. Avoiding reflection is pointless except in unusual circumstances - for example, in a deeply nested loop. Appropriate use of reflection has many benefits, and its performance overhead is nowhere near sufficient to justify avoiding it. Of course app code will normally use reflection only via an abstraction provided by infrastructure code.