Automated Testing: A Summary

XP regards testing as central to the activity of software development. To quote Dan Rawsthorne from the afterword of Extreme Programming Installed, “XP works because it is validation-centric rather than product-centric.” Testing software continuously validates that the software works and that it meets the customer’s requirements. Automating the tests ensures that testing will in fact be continuous. Without testing, a team is just guessing that its software meets those requirements. XP cannot be done without automated testing, nor can development be done successfully without it. All software projects need to satisfy the intended customer and to be free of defects.

Tests and Refactoring

Another core XP practice is refactoring (changing existing code for simplicity, clarity, and/or feature addition). Refactoring cannot be accomplished without tests. If you don’t practice XP, you may not be refactoring religiously. Even the most stable or difficult-to-change projects require occasional modification. To do it right, programmers will have to change the existing design. That’s where automated testing comes in.

Object-oriented coding (and, to a lesser extent, other coding styles) separates interface from implementation. In theory, this means you can change the underlying logic behind a class or method, and dependent code will handle the change seamlessly. Entire tutorials have been written about this powerful abstraction. However, if in practice the programmers are scared to change the underlying logic for fear of disturbing code that interacts with the interface, then this separation might as well not exist. Comprehensive tests (run frequently) verify how the system should work and allow the underlying behavior to change freely. Any problems introduced during a change are caught by the tests. If Design A and Design B produce equivalent results when tested, the code can be migrated from one to the other freely. With testing in place, programmers refactor with confidence, the code works, and the tests prove it.
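As a minimal sketch of the "Design A versus Design B" idea, the following plain-Java example runs one set of checks against two interchangeable implementations of the same interface. The names `PriceSorter`, `LibrarySorter`, `InsertionSorter`, and `SorterContract` are hypothetical and chosen only for illustration; a real project would more likely express the checks as JUnit assertions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical interface: dependent code relies only on this contract.
interface PriceSorter {
    List<Integer> sortAscending(List<Integer> prices);
}

// Design A: delegates to the library sort.
class LibrarySorter implements PriceSorter {
    public List<Integer> sortAscending(List<Integer> prices) {
        List<Integer> copy = new ArrayList<>(prices);
        Collections.sort(copy);
        return copy;
    }
}

// Design B: a hand-written insertion sort with the same observable behavior.
class InsertionSorter implements PriceSorter {
    public List<Integer> sortAscending(List<Integer> prices) {
        List<Integer> result = new ArrayList<>();
        for (Integer price : prices) {
            int i = 0;
            // Walk past every element that should stay in front of the new one.
            while (i < result.size() && result.get(i) <= price) {
                i++;
            }
            result.add(i, price);
        }
        return result;
    }
}

// The same checks run against any implementation of the interface.
class SorterContract {
    static void verify(PriceSorter sorter) {
        List<Integer> input = Arrays.asList(30, 10, 20, 10);
        List<Integer> sorted = sorter.sortAscending(input);
        if (!Arrays.asList(10, 10, 20, 30).equals(sorted)) {
            throw new AssertionError("unexpected order: " + sorted);
        }
        if (!Arrays.asList(30, 10, 20, 10).equals(input)) {
            throw new AssertionError("input list was modified");
        }
        if (!sorter.sortAscending(new ArrayList<Integer>()).isEmpty()) {
            throw new AssertionError("empty input should give empty output");
        }
    }

    public static void main(String[] args) {
        verify(new LibrarySorter());
        verify(new InsertionSorter());
        System.out.println("Both designs satisfy the same checks");
    }
}
```

Because both designs pass `SorterContract.verify`, code that depends only on `PriceSorter` can move from one to the other without change.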

Types of Automated Tests

Unit tests are the most talked-about tests in XP; however, they are only part of the testing picture. Unit tests work alongside integration tests, functional tests, and auxiliary tests (performance tests, regression tests, and so on) to ensure that the system as a whole works.

Unit Tests: JUnit

Unit tests are the first (and perhaps the most critical) line of tests in the XP repertoire. Writing a unit test involves taking a unit of code and testing everything that could possibly break. A unit test usually exercises all the methods in the public interface of a class. Good unit tests do not necessarily test every possible permutation of class behavior, nor do they test ultra-simple methods (simple accessors come to mind); rather, they provide a common-sense verification that the code unit behaves as expected. With this verification, the public interface gains meaning. This approach makes changing unit behavior easier, and also provides a convenient (and verifiable) guide to the behavior of the unit. Developers can consult a test to discover the intended use of a class or method. In XP, unit tests become part of the cycle of everyday coding. Ideally, programmers write tests before the code, and use the test as a guide to assist in implementation. The authors both work in this mode, and we find ourselves unable to live without the guidance and corrective influence of unit tests. After a unit is complete, the team adds the test to the project’s test suite. This suite of unit tests runs multiple times per day, and all the tests always pass. This sounds extreme; however, a 100 percent pass rate on unit tests is far more sane than the alternative: a piece of vital production code that does not work. (If the code isn’t vital, why is it in the project?) Verifying each class builds a higher-quality system because it ensures that the building blocks work. Unit tests also lead the way toward clean architecture. If a developer writes a test three times for the same code in different locations, laziness and irritation will compel her to move the code to a separate location.
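The shape of such a unit test can be sketched as follows. To stay self-contained, this example uses no testing framework: `ShoppingCart` is a hypothetical unit invented for illustration, and the small `assertEquals` helper is a stand-in for the assertions a framework such as JUnit would supply. Each test method exercises one behavior of the public interface, including a failure case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical unit under test.
class ShoppingCart {
    private final List<Long> pricesInCents = new ArrayList<>();

    void addItem(long priceInCents) {
        if (priceInCents < 0) {
            throw new IllegalArgumentException("price must not be negative");
        }
        pricesInCents.add(priceInCents);
    }

    int getItemCount() {
        return pricesInCents.size();
    }

    long getTotalInCents() {
        long total = 0;
        for (long price : pricesInCents) {
            total += price;
        }
        return total;
    }
}

// Framework-free test class; with JUnit these would be test methods
// and the helper below would come from the framework.
class ShoppingCartTest {
    static void assertEquals(long expected, long actual) {
        if (expected != actual) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    static void testNewCartIsEmpty() {
        ShoppingCart cart = new ShoppingCart();
        assertEquals(0, cart.getItemCount());
        assertEquals(0, cart.getTotalInCents());
    }

    static void testTotalSumsAllItems() {
        ShoppingCart cart = new ShoppingCart();
        cart.addItem(999);
        cart.addItem(501);
        assertEquals(2, cart.getItemCount());
        assertEquals(1500, cart.getTotalInCents());
    }

    static void testNegativePriceIsRejected() {
        ShoppingCart cart = new ShoppingCart();
        try {
            cart.addItem(-1);
            throw new AssertionError("expected IllegalArgumentException");
        } catch (IllegalArgumentException expected) {
            // The rejected item must not have been added.
            assertEquals(0, cart.getItemCount());
        }
    }

    static void runAll() {
        testNewCartIsEmpty();
        testTotalSumsAllItems();
        testNegativePriceIsRejected();
    }

    public static void main(String[] args) {
        runAll();
        System.out.println("All ShoppingCart tests passed");
    }
}
```

Read top to bottom, the test methods also document how `ShoppingCart` is meant to be used, which is the "verifiable guide" role described above.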

JUnit is a lightweight testing framework written by Erich Gamma and Kent Beck (one of the chief proponents of XP). The authors based its design on SUnit, a successful and popular unit-testing framework written by Beck for Smalltalk. The simplicity of the framework lends itself to rapid adoption and extension. All the testing tools covered in this tutorial (with the exception of JMeter, a GUI tool) interact with or extend the JUnit framework.

Integration/In-Container Tests: Cactus

Unit testing covers Object X, but what about related Objects Y and Z, which together make up subsystem A? Unit tests are deliberately isolated. A good unit test verifies that no matter what chaos reigns in the system, at least this class functions as expected. Several papers have been written about strategies to avoid dependencies in unit tests (the core idea is to provide mock implementations of objects upon which the tested class depends). By all means, the unit tests should be made as independent as possible.

In their book Extreme Programming Installed, Jeffries et al. make an interesting observation about errors that show up only in collaborations between classes; they say, “Our own experience is that we get very few of these errors. We’re guessing here, that somehow our focus on testing up front is preventing them.” They go on to admit, “When they do show up, such problems are difficult to find.” Good unit testing should indeed catch most errors, and the behavior of the entire system falls under the category of acceptance testing (also known as functional testing); however, a good test suite should verify subsystem behavior as well. Integration testing occupies the gray area between unit and acceptance testing, providing a sanity check that all the code cooperates and that subtle differences between expectation and reality are precisely localized. Integration tests may not always run at 100 percent (a dependency class may not be completed yet, for instance); however, their pass rates should be quite high (in the 80 to 90 percent range).

An important variety of integration test is the in-container test. The J2EE development model dictates components residing in a container. Components rely on services provided by the container. Interaction with those services needs to be verified. Although some interactions can be successfully mocked up, creating mock implementations for all the services provided by a J2EE container would consume time and verify behavior imperfectly. Some services, such as behaviors specified by deployment descriptors, could be very difficult to test, because container implementations differ.
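The mock-implementation idea mentioned above can be sketched in a few lines of plain Java. All names here (`MailService`, `OrderNotifier`, `MockMailService`) are hypothetical; the point is only that the class under test receives a recording stand-in instead of the real service, so the test runs without a mail server or container.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical dependency; in production this might be backed by a container service.
interface MailService {
    void send(String recipient, String message);
}

// Class under test: it depends only on the MailService interface.
class OrderNotifier {
    private final MailService mail;

    OrderNotifier(MailService mail) {
        this.mail = mail;
    }

    void orderShipped(String customerEmail, int orderId) {
        mail.send(customerEmail, "Order " + orderId + " has shipped");
    }
}

// Mock implementation: records each call instead of sending anything.
class MockMailService implements MailService {
    final List<String> sent = new ArrayList<>();

    public void send(String recipient, String message) {
        sent.add(recipient + ": " + message);
    }
}

class OrderNotifierTest {
    static void testShippingSendsExactlyOneMail() {
        MockMailService mock = new MockMailService();
        new OrderNotifier(mock).orderShipped("pat@example.com", 42);

        if (mock.sent.size() != 1) {
            throw new AssertionError("expected 1 mail, got " + mock.sent.size());
        }
        String expected = "pat@example.com: Order 42 has shipped";
        if (!expected.equals(mock.sent.get(0))) {
            throw new AssertionError("unexpected mail: " + mock.sent.get(0));
        }
    }

    public static void main(String[] args) {
        testShippingSendsExactlyOneMail();
        System.out.println("OrderNotifier test passed");
    }
}
```

A mock this small is cheap to write. The trade-off described above appears when the dependency is a full container service, whose real behavior a hand-written mock can only approximate.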

The Cactus framework provides access to J2EE Web containers (which in turn usually provide access to other types of containers, such as EJB containers). By allowing tests to exercise code in the container, Cactus spares developers the chore of providing extensive or difficult mock-ups (they can use the real services, instead). This approach also provides an extra measure of feedback, because the code runs in an environment that is one step closer to its production habitat. In the case of single objects that just interact with container services, in-container tests serve as quick-and-dirty unit tests.

Acceptance/Functional Tests: HttpUnit

Functional testing ensures that the whole system behaves as expected. These tests are also called acceptance tests because they verify for the customer that the system is complete. (In other words, a Web site is not done until it can log in users, display products, and allow on-line ordering.) Functional tests are daunting in some ways (they are not an immediate productivity aid like unit tests), but they are crucial to measuring progress and catching any defects that slipped past the other tests or result from unimplemented/incomplete features. Acceptance tests are written by the customer (the programmers may implement them) because they are for the customer. Unit testing verifies for the coding team that the Foo class works correctly. Acceptance tests verify for the customer (who may not know a Foo from a Bar) that their whole system works correctly.

Acceptance tests are less dependent upon specific implementation: For example, during an aggressive refactoring, the team may decide they no longer need a SubCategory object. If so, the SubCategoryTest goes to execute in the Big Container in the Sky. The team modifies the integration tests (if necessary) to account for the new system structure. However, the functional tests remain unchanged, validating that the user’s experience of the catalog navigation system remains unchanged. Functional tests do not need to always run at 100 percent, but they should do so before the software is released. Functional tests often verify specific stories (an XP representation of a customer-requested feature). As such, they can track progress through a development cycle. Each test that runs represents a finished feature.

Unfortunately but understandably, no one has written a universal acceptance-testing tool. JUnit can work on just about any Java class, but an acceptance-testing tool must be tailored to the needs of a specific app. For a number-crunching program, acceptance testing could be as easy as verifying inputs versus outputs. For a data-entry app, a GUI recording and playback tool might be necessary.

We chose to cover HttpUnit, a testing API that allows programmatic calls to Web resources and inspection of the responses. The framework cooperates with JUnit and exposes the underlying structure of an HTML page to allow easy verification of structural elements (for example, checking that the response to show_product.jsp contains a table of product prices). It seemed a natural fit for the focus of this tutorial, because J2EE is heavily concerned with Web components. Acceptance testing the deeper components of J2EE might not even require a special framework because of the low level of presentation logic involved.
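To illustrate the general pattern of "request a Web resource, then inspect the response," here is a rough sketch that uses only JDK classes rather than HttpUnit's own API. It starts a throwaway local HTTP server as a stand-in for the deployed application; the page content, the price, and the class name `ProductPageFunctionalTest` are assumptions made up for the example, and the checks are simple substring matches rather than the structured HTML inspection HttpUnit offers.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class ProductPageFunctionalTest {
    // Stand-in for the real site: serves one hard-coded product page.
    static HttpServer startStubSite() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/show_product.jsp", new HttpHandler() {
            public void handle(HttpExchange exchange) throws IOException {
                byte[] body = ("<html><body><table>"
                        + "<tr><td>Widget</td><td>$9.99</td></tr>"
                        + "</table></body></html>").getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().set("Content-Type", "text/html; charset=UTF-8");
                exchange.sendResponseHeaders(200, body.length);
                OutputStream out = exchange.getResponseBody();
                out.write(body);
                out.close();
            }
        });
        server.start();
        return server;
    }

    // Issues a GET request and returns the response body as text.
    static String fetch(String url) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) new URL(url).openConnection(Proxy.NO_PROXY);
        connection.setConnectTimeout(5000);
        connection.setReadTimeout(5000);
        if (connection.getResponseCode() != 200) {
            throw new AssertionError("unexpected status " + connection.getResponseCode());
        }
        InputStream in = connection.getInputStream();
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            in.close();
        }
    }

    // The functional check: the product page must contain a table with a price.
    static void testProductPageShowsPriceTable() {
        try {
            HttpServer server = startStubSite();
            try {
                int port = server.getAddress().getPort();
                String html = fetch("http://127.0.0.1:" + port + "/show_product.jsp");
                if (!html.contains("<table>")) {
                    throw new AssertionError("no table in response");
                }
                if (!html.contains("$9.99")) {
                    throw new AssertionError("price missing from response");
                }
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new AssertionError("could not reach the stub site", e);
        }
    }

    public static void main(String[] args) {
        testProductPageShowsPriceTable();
        System.out.println("Product page test passed");
    }
}
```

Note that the test never refers to the classes behind the page, which is why a test of this kind can survive an internal refactoring unchanged.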

Performance Tests: JUnitPerf and JMeter

Several types of testing exist besides basic verification of function: parallel tests (which verify that a new system behaves exactly like an old system), performance tests, validation tests (which verify that the system responds well to invalid input), and so on. Of these, performance testing is perhaps the most widely applicable. After all, the most functional system in the world won’t be worth a dime if end users give up on the software in disgust. Client-side apps that perform poorly are trouble; server-side apps that drag are emergencies. J2EE apps are usually hosted on servers handling anywhere from hundreds to thousands (and up!) of transactions per minute, so a small section of inefficient code can bring a system to its knees. In this sort of environment, performance ranks with functional (or even unit) testing in its priority.

We will not be covering performance profiling tools (critical to solving performance issues); rather, we’ll discuss the testing tools used to uncover these problems early. JUnitPerf does unit performance testing: it decorates existing JUnit tests so that they fail if their running times exceed expectations. Such tests support refactoring by verifying that performance-critical code remains within expected boundaries. JMeter provides functional performance testing: measuring and graphing response times to requests sent to a remote server (Web, EJB, database, and so on). With JMeter, customers can write acceptance tests like, “The Web server will maintain a three-second or better response time to requests with a 150-user simultaneous load.”
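The decoration idea behind unit performance testing can be sketched without any library. In this hypothetical example, `Check` stands in for an existing test and `TimedCheck` wraps it, failing if the wrapped test takes longer than a given limit. This illustrates the decorator pattern only; it is not JUnitPerf's actual API, and the time limits are arbitrary.

```java
// Hypothetical stand-in for an existing test.
interface Check {
    void run();
}

// Decorator: runs the wrapped check, then fails if it took too long.
class TimedCheck implements Check {
    private final Check inner;
    private final long maxMillis;

    TimedCheck(Check inner, long maxMillis) {
        this.inner = inner;
        this.maxMillis = maxMillis;
    }

    public void run() {
        long start = System.nanoTime();
        inner.run(); // a functional failure in the wrapped check still propagates
        long elapsedMillis = (System.nanoTime() - start) / 1000000L;
        if (elapsedMillis > maxMillis) {
            throw new AssertionError(
                    "took " + elapsedMillis + " ms, limit is " + maxMillis + " ms");
        }
    }
}

class TimedCheckDemo {
    // A fast check: sums a small range of numbers and verifies the result.
    static Check sumCheck() {
        return new Check() {
            public void run() {
                long sum = 0;
                for (int i = 1; i <= 1000; i++) {
                    sum += i;
                }
                if (sum != 500500L) {
                    throw new AssertionError("wrong sum: " + sum);
                }
            }
        };
    }

    public static void main(String[] args) {
        // The functional check is unchanged; the decorator adds a time budget.
        new TimedCheck(sumCheck(), 2000).run();
        System.out.println("Check passed within its time budget");
    }
}
```

Because the decorator leaves the wrapped test untouched, the same test serves as both a functional check and a performance guard during refactoring.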