Strategic Issues and Definitions

High-level design largely determines the efficiency of a J2EE app. Code-level optimization can help to eliminate known bottlenecks, but will seldom produce a comparable return on effort.

J2EE provides a wealth of design choices, at many levels. Perhaps most important is the question of whether to use a distributed architecture. If we do adopt a distributed architecture, app partitioning - the question of which components run where - will largely determine performance and throughput. As J2EE offers a wide variety of component types, it's also important to consider the performance implications when modeling our apps. For example, should we perform a particular data operation using entity beans, or using JDO or JDBC? Do we really need to use EJB, which delivers many services but adds overhead?

We must balance performance considerations with other issues. For example, should we use high-performing but non-portable features of our initial target database? Should we complicate design and implementation by adding our own caching services, or by using third-party caching software not supplied with our app server? Such decisions may involve tradeoffs. Performance considerations will often be critical to decision making, but good design and a maintainable codebase remain essential. Fortunately, as we'll see, there's often no need to make sacrifices to achieve good performance: good design often leads to good performance, and provides the flexibility essential to implement caching and other worthwhile optimizations.

It's essential to remember that satisfactory performance is a critical business requirement. If an app doesn't deliver adequate performance, it doesn't matter how elegant it is or how many design patterns it uses. It's a failure, because it doesn't satisfy the needs of the business. Too many J2EE developers (and too much J2EE literature!) seem to forget this, leading to inflexible pursuit of inherently inefficient designs.
Given the risks that unsatisfactory performance or insufficient throughput pose to a project, we should consider performance implications throughout the project lifecycle. We should be prepared to run performance and load tests as development progresses. For example, we may need to implement a "vertical slice" early in the project lifecycle to verify that the design will result in satisfactory performance before we are too heavily committed.


Performance and throughput should be tested as early as possible in the project lifecycle, by implementing a "vertical slice" or "spike solution". If such tests are left until the implementation is functionally complete, rectifying any problems may require major changes to design and code.

Often the vertical slice will implement some of the functionality that we suspect will have the worst performance in the app and some of the functionality that will be most heavily used in production. However, we should try to back suspicions about likely performance problems with evidence. It's not always possible to predict where problem areas will be.

Performance and Scalability

Performance and scalability are quite distinct. When we speak of an app's performance, we are usually referring to the time taken to execute its key use cases on a given hardware configuration. For example, we might measure the performance of an app running in a single app server instance on a server on which it will be deployed in production. A performant app appears fast to its users.

However, performance doesn't necessarily measure the app's ability to cope with increased load, or to take advantage of a more powerful hardware configuration. This is the issue of scalability. Scalability is often measured in terms of the number of concurrent users that can be served by the app, or transaction throughput. Scalability encompasses scalability on a given hardware configuration (for example, the maximum number of concurrent users that can be handled by a given server), and maximum scalability (the absolute maximum throughput that can be achieved in any hardware deployment). Achieving maximum scalability will require the hardware configuration to be enhanced as necessary as the load increases; usually, a growing cluster of servers will underpin scalability. Maximum scalability will usually be limited in practice, as the overhead of running a cluster grows with the cluster's size, meaning that the return from adding servers diminishes.

A special case of scalability concerns data volume. For example, a recruitment system might support user searches efficiently with 500 jobs in its database, covering one industry in one city. Will it still function efficiently with 100,000 jobs, covering many industries across several countries?

Performance and scalability are related in practice, though distinct concepts. An app that performs badly is also likely to have poor scalability: an app that thrashes a server's CPU handling the requirements of 5 users is hardly likely to satisfy 200.
However, apps can be highly performant under light load, yet exhibit poor scalability. Scalability is a challenge that can expose many flaws in an app, some very subtle. Potential problems include excessive synchronization blocking concurrent threads (a problem that might not be apparent under light load, when the app might appear very responsive), leaking of resources, excessive memory use, inefficient database access overloading database servers, and inability to run efficiently in a cluster. Failure to scale to meet demand will have grave consequences, such as spiraling response times and unreliability. Both performance and scalability are likely to be important business requirements. For example, in the case of a web site, poor performance may drive users away on their first visit. Poor scalability will eventually prove fatal if the site does succeed in attracting and retaining users.
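To make the synchronization problem concrete, the following sketch contrasts a shared component whose coarse locking serializes every concurrent request with a version that lets reads proceed in parallel. The registry classes here are hypothetical illustrations, not part of any J2EE API, and the sketch uses modern Java concurrency utilities.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SynchronizationExample {

    // Anti-pattern: one lock guards every read and write, so all
    // concurrent callers queue behind each other. Under light load this
    // appears perfectly responsive; under heavy load it is a bottleneck.
    static class CoarselyLockedRegistry {
        private final Map<String, String> data = new HashMap<>();

        public synchronized String lookup(String key) {
            return data.get(key);
        }

        public synchronized void register(String key, String value) {
            data.put(key, value);
        }
    }

    // Better for read-mostly shared state: a concurrent map allows
    // lookups to proceed without blocking one another.
    static class ConcurrentRegistry {
        private final Map<String, String> data = new ConcurrentHashMap<>();

        public String lookup(String key) {
            return data.get(key);
        }

        public void register(String key, String value) {
            data.put(key, value);
        }
    }
}
```

A load test, rather than a single-user functional test, is what exposes the difference between these two designs: both return identical results, but only one serializes every caller.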

Adopting J2EE does not allow us, as app architects and developers, to assume that performance and scalability requirements have been taken care of for us by the gurus at BEA or IBM. If we do, we're in for a nasty surprise. While the J2EE infrastructure can help us to achieve scalable and performant apps if used correctly, it also adds overhead, which can severely impact the performance of naively implemented apps. To ensure that our apps are efficient and scalable, we must understand some of the overheads and implementation characteristics of J2EE app servers.

Setting Clear Goals for Performance and Scalability

It's important to be precise about what we want to achieve with performance and scalability. Scalability is likely to be essential. If an app can't meet the demands of the user community - or growth in the user community - it will be judged a failure. However, performance may involve more of a tradeoff. Performance may need to be balanced against other considerations such as extensibility, portability, and the likely cost of maintenance. We may even need to trade off within the performance domain. For example, some use cases may be critical to the app's acceptance, whereas others are less important. This may have implications for the app's design. Clearly such tradeoffs, determining the way in which an app should attempt to meet its business requirements, should be made up front.


It's important to have specific performance and scalability requirements for apps, rather than the vague idea that an app should "run fast and cope with many users". Non-functional requirements should contain targets for performance and throughput that are clear enough to enable testing to verify that they can be met at any point in the project lifecycle.

Design Versus Code Optimization

Major design issues in J2EE apps are likely to have a performance impact far outweighing any code-level optimizations. For example, a poor design that makes two remote calls in place of one in a time-critical operation in a distributed app may harm performance far more severely than any amount of code optimization can redress. It's important to minimize the need for code-level optimization, for the following reasons:

- Optimizing existing code is hard, and effort spent optimizing code that isn't a proven bottleneck is wasted
- Optimization is a common cause of bugs
- Optimization tends to make code harder to understand, reducing maintainability

Let's consider each of these points in turn, as they apply to J2EE development.

Few things in coding are harder than optimizing existing code. Perhaps this is why optimization is uniquely satisfying to any programmer's ego. The problem is that the resources devoted to such optimization may well be wasted. Most programmers are familiar with the 80/20 rule (or variants), which states that 80% of execution time is spent in 20% of an app's codebase (this is also known as the Pareto Principle). The exact ratio varies between apps - 90/10 may be more common - but the principle is borne out empirically. This means that optimizing the wrong 80 or 90% of an app's codebase is a waste of time and resources. The answer is to identify the bottlenecks in an app and address those specific problems (it is best to use tools, rather than gut feeling, to identify bottlenecks; we'll discuss the necessary tools below). Optimizing code that isn't proving to be a problem will do more harm than good. In the J2EE context, such optimization may be especially pointless, as bottlenecks may be in app server code, invoked too heavily because of poor app design. Remember that much of the code that executes when a J2EE app runs is part of the app server, not the app.

Optimization is a common cause of bugs. The legendary Donald Knuth, author of The Art of Computer Programming, has stated, "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." If optimization makes code harder to understand (and many optimizations do), it's a serious worry. Optimization causes bugs directly, when code that once worked slowly but reliably is replaced by faster code that works only some of the time. Its effect on quality is more insidious: code that is hard to understand is easy to break, and will consume excessive resources forever in the course of app maintenance.
Any tradeoff between performance and maintainability is critical in J2EE apps, because enterprise apps tend to be mission critical and because of the ongoing investment they represent. There is no conflict between designing for performance and maintainability, but optimization may be more problematic. In the event of a conflict, it may be appropriate to sacrifice performance for maintainability. The fact that we choose to write software in Java, rather than C or assembly language, is proof of this. Since maintenance accounts for most of the cost of a software project, simply buying faster hardware may prove cheaper than squeezing the last ounce of performance out of an app, if this makes it harder to maintain. Of course, there are situations where performance outweighs other considerations.


Minimize the need for optimization by heading off problems with good design. Optimize reluctantly and choose what to optimize based on hard evidence such as profiler results.
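A profiler is the right tool for gathering such evidence, but even a crude elapsed-time measurement beats gut feeling. The following sketch shows the principle of measuring a suspected hot spot before deciding to optimize it; the method being timed is invented purely for illustration.

```java
public class TimingExample {

    // A candidate "hot spot" - a hypothetical compute-bound method.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        // Measure before optimizing: is this method actually a problem?
        long start = System.nanoTime();
        long result = sumOfSquares(1_000_000);
        long elapsedMicros = (System.nanoTime() - start) / 1_000;
        System.out.println("sumOfSquares = " + result
                + " took " + elapsedMicros + " microseconds");
    }
}
```

If the measured time is negligible relative to the use case's total response time, optimizing this method is exactly the kind of effort the 80/20 rule tells us is wasted.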

Some authorities on optimization recommend writing a simple program first, and then optimizing it. This approach will usually only work satisfactorily in J2EE within components, after component interfaces have been established. It is often a good approach to provide a quick implementation of an interface without concern as to performance, and to optimize it later if necessary. If we attempt to apply the "optimize later" approach to J2EE development as a whole, we're in for disappointment; where distributed J2EE apps are concerned, we're heading for disaster. The single area in which we can achieve significant performance improvement in J2EE apps after we've built a system is caching. However, as we shall see, caching can only be added with ease if an app's design is sound.
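As a sketch of why sound design makes caching easy to retrofit: when an expensive operation sits behind an interface, a caching decorator can be slotted in without touching callers. The JobSearch interface and both implementations below are hypothetical, invented to illustrate the point.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CachingExample {

    // Callers depend only on this interface, never on a concrete class.
    interface JobSearch {
        int countJobs(String industry);
    }

    // Imagine this implementation runs a slow database query per call.
    static class DatabaseJobSearch implements JobSearch {
        public int countJobs(String industry) {
            return industry.length() * 100; // stand-in for real query results
        }
    }

    // Caching decorator: same interface, so it can replace the uncached
    // version without changing any client code.
    static class CachingJobSearch implements JobSearch {
        private final JobSearch delegate;
        private final Map<String, Integer> cache = new ConcurrentHashMap<>();

        CachingJobSearch(JobSearch delegate) {
            this.delegate = delegate;
        }

        public int countJobs(String industry) {
            // Hit the delegate once per industry; later calls use the cache.
            return cache.computeIfAbsent(industry, delegate::countJobs);
        }
    }
}
```

Had clients depended directly on the concrete DatabaseJobSearch class, adding this cache would have meant editing every call site - exactly the kind of design flaw that makes caching hard to add late in a project.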