Testing the "Right" Persistence - Java Programming Language

So far we have seen that it is desirable and possible to free the persistence interface from dependence upon a specific persistence implementation. This means that we can simplify and accelerate the tests for the underlying code significantly. However, we are not free from our obligation to test the actual implementation of the persistence interface. Of course, this means that we will encounter the same problems we had to deal with in our first naive version of DailyReportTest (see ), with one exception. It is now sufficient to test the expected persistence functionality only once; we don't have to do the same things over and over again for all higher-level tests. Among other things, we have to test for the following:

Can I create, modify, and delete objects?
Will the query methods supply the correct results?
Will violations of domain constraints be prevented?
Does the transactional behavior work?

Let's look at a few examples from this catalog. The first example is a test for the creation of persistent customer objects, including the setUp() and tearDown() code parts:

import java.util.Iterator;
import java.util.Set;
import junit.framework.TestCase;

public class CRMDatabaseTest extends TestCase {
    private CRMDatabase database;
    protected void setUp() throws Exception {
        database = new CRMDatabase("jdbc:odbc:CRM");
    }
    // Deletes everything the test created so that no "dead objects" remain.
    protected void tearDown() throws Exception {
        if (database.isConnected()) {
            Iterator i = database.allCategories().iterator();
            while (i.hasNext()) {
                CustomerCategory each =
                    (CustomerCategory) i.next();
                this.deleteCategoryAndDependentCustomers(each);
            }
            database.shutdown();
        }
    }
    // Customers must go first; a category that is still in use cannot be deleted.
    private void deleteCategoryAndDependentCustomers(
            CustomerCategory category) throws CRMException {
        Iterator i = database.allCustomers(category).iterator();
        while (i.hasNext()) {
            Customer each = (Customer) i.next();
            database.deleteCustomer(each);
        }
        database.deleteCategory(category);
    }
    public void testCustomerCreation() throws Exception {
        CustomerCategory cat = database.createCategory("cat");
        Customer customer1 =
            database.createCustomer("customer1", cat);
        Customer retrieved1 =
            database.getCustomer(customer1.getId());
        assertEquals(customer1, retrieved1);
        assertEquals(customer1.getName(), retrieved1.getName());
        assertEquals(customer1.getCategory(),
            retrieved1.getCategory());
        Customer customer2 =
            database.createCustomer("customer2", cat);
        Customer retrieved2 =
            database.getCustomer(customer2.getId());
        assertEquals(customer2, retrieved2);
        Set allCustomers = database.allCustomers(cat);
        assertEquals(2, allCustomers.size());
        assertTrue(allCustomers.contains(customer1));
        assertTrue(allCustomers.contains(customer2));
    }
}

A prerequisite for this kind of testing is an empty database (an ODBC database in this example), empty in the sense that no Customer and CustomerCategory objects exist in the visible area. If we cannot guarantee this, the effort will be much greater. The tearDown() code is still relatively complicated because we want to ensure that no "dead objects" from the tests remain. If this code fails even once, things may have to be deleted from the database manually to get back to a controllable initial state. In contrast, there is nothing really surprising about the test method testCustomerCreation(). It creates the required objects, fetches them again from the database, and verifies the attributes. The tests for deleting and overwriting are similar. The following test, this time omitting setUp() and tearDown(), checks the allCustomers(...) query method:

public void testAllCustomers() throws Exception {
    CustomerCategory cat1 = database.createCategory("cat1");
    CustomerCategory cat2 = database.createCategory("cat2");
    Customer customer1 = database.createCustomer("customer1", cat1);
    Customer customer2 = database.createCustomer("customer2", cat2);
    Customer customer3 = database.createCustomer("customer3", cat1);
    Set cat1Customers = database.allCustomers(cat1);
    assertEquals(2, cat1Customers.size());
    assertTrue(cat1Customers.contains(customer1));
    assertTrue(cat1Customers.contains(customer3));
    Set cat2Customers = database.allCustomers(cat2);
    assertEquals(1, cat2Customers.size());
    assertTrue(cat2Customers.contains(customer2));
}

Nothing new, or is there? Much more interesting are unit tests that verify correct error signaling, for example on an attempt to delete a CustomerCategory object that is still in use:

public void testCategoryDeletionWithCustomerFailure()
        throws Exception {
    CustomerCategory cat = database.createCategory("Category 1");
    Customer cust = database.createCustomer("customer1", cat);
    try {
        database.deleteCategory(cat);
        fail("CRMException expected");
    } catch (CRMException expected) {}
    database.deleteCustomer(cust);
    database.deleteCategory(cat);
}

This is another test case that corresponds to the pattern introduced in , . Note that we also verify correct continuation after the CRMException has occurred: once the customer is deleted, deleting the category must succeed. Our last example checks that a rollback takes place when executeTransaction() is used:

public void testExecuteTransactionRollback() throws Exception {
    CRMTransaction t = new CRMTransaction() {
        public Object run() throws CRMException {
            database.createCategory("cat1");
            // should fail and rollback:
            database.createCategory("cat1");
            return database.allCategories();
        }
    };
    try {
        database.executeTransaction(t);
        fail("CRMException should have been thrown");
    } catch (CRMException expected) {}
    assertTrue(database.allCategories().isEmpty());
}

All tests shown in the preceding examples, except for the creation of a database instance, are independent of the type of the underlying implementation. Instead of directly connecting to an RDBMS via JDBC, we could alternatively use object-relational mapping tools, object-oriented databases, or persistence mechanisms based on serialization. Depending on the implementation, we will need additional tests, for example to check for correct implementation of caching mechanisms. In addition, we haven't considered concurrent access to the database in several threads yet. , Concurrent Programs, includes suggestions for this type of concurrent test. The Web site accompanying this tutorial includes a complete test suite for a simple and non-concurrent JDBC implementation of the persistence interface.
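One way to exploit this independence is to keep the implementation-neutral checks in an abstract class and let each implementation supply only a factory method. The following is a minimal sketch of that idea, written in plain Java rather than JUnit so that it is self-contained. The names CategoryStore, InMemoryCategoryStore, CategoryStoreContract, and InMemoryCategoryStoreCheck are hypothetical, and the interface is a drastically reduced stand-in for the persistence interface, not the one used in the examples above.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical, reduced stand-in for the persistence interface.
interface CategoryStore {
    void createCategory(String name);   // must reject duplicate names
    void deleteCategory(String name);
    Set<String> allCategories();
}

// Trivial in-memory implementation, used here only to exercise the contract.
class InMemoryCategoryStore implements CategoryStore {
    private final Set<String> names = new LinkedHashSet<>();

    public void createCategory(String name) {
        if (!names.add(name)) {
            throw new IllegalStateException("duplicate category: " + name);
        }
    }

    public void deleteCategory(String name) {
        names.remove(name);
    }

    public Set<String> allCategories() {
        return new LinkedHashSet<>(names);
    }
}

// Implementation-neutral checks; only createStore() knows the concrete type.
abstract class CategoryStoreContract {
    protected abstract CategoryStore createStore();

    void checkCreateAndDelete() {
        CategoryStore store = createStore();
        store.createCategory("cat1");
        require(store.allCategories().contains("cat1"), "created category is visible");
        store.deleteCategory("cat1");
        require(store.allCategories().isEmpty(), "deleted category is gone");
    }

    void checkDuplicateIsRejected() {
        CategoryStore store = createStore();
        store.createCategory("cat1");
        try {
            store.createCategory("cat1");
            throw new AssertionError("duplicate was accepted");
        } catch (IllegalStateException expected) {
            // the store signalled the constraint violation
        }
        require(store.allCategories().size() == 1, "first category survives");
    }

    void runAll() {
        checkCreateAndDelete();
        checkDuplicateIsRejected();
    }

    private static void require(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }
}

// One small subclass per implementation under test.
class InMemoryCategoryStoreCheck extends CategoryStoreContract {
    protected CategoryStore createStore() {
        return new InMemoryCategoryStore();
    }
}
```

Calling new InMemoryCategoryStoreCheck().runAll() runs the whole contract against the in-memory store. A JDBC-backed or serialization-backed store would need only another three-line subclass. With JUnit, the same shape would presumably be an abstract TestCase subclass whose setUp() calls the factory method.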

Approaches for Test Data Consistency

The approach selected above to provide a persistent test fixture was pretty simple: persistent objects were created in setUp() or in the respective test method. These objects can then be modified, deleted, fetched, or otherwise manipulated within the test. Finally, we ensured in the tearDown() method that all created objects will eventually be deleted. This approach works well when the addressed (logical) database is available exclusively for a unit test. Also, if the initial state is always an empty database, then deleting objects can often be accelerated by several drop table and subsequent create table invocations. The situation is different when dealing with a database accessed by several developers, perhaps even concurrently. In this case, we have to ensure that, once a test run is completed, the database is put back into the previous state. We could extend the approach discussed here in this direction, for example by using specific IDs for specific test types, or by marking specific records as test records. In practice, all of this leads to a large number of linked dependencies, coordination problems among developers, increasingly complex tearDown code, and increasingly inconsistent databases. In fact, these databases would often have to be repaired manually or completely refreshed. [URL:Dbunit] describes difficulties arising in the cases we've discussed and suggests the use of four databases for different purposes and test types:

1. The production database, consisting of live data; no testing on this database.
2. Your local development database, which is where most of the testing is carried out.
3. A populated development database, possibly shared by all developers so you can run your app and see it work with realistic amounts of data, rather than the handful of records you have in your test database.
4. A deployment database, where the tests are run prior to deployment to make sure any local database changes have been applied.

One problem identified with the four-database approach is keeping the data schemas synchronized. Database 2, the local development database, is the one suited to unit tests in our sense, because it satisfies the prerequisite of a dedicated database.

Speeding Up the Test Suite

Although their number is reduced, persistent test cases still take a long time to run. An approach that has been successfully used by the author is to run the persistent test cases against a lightweight or in-memory database during development and to switch to the real database only for integration tests. For example, HsqlDb [URL:HsqlDb] can be run in several modes, two of which follow:

In-memory mode. Holds the data submitted via SQL only for as long as the connection remains open. This is useful for testing classes that work on instances of java.sql.Connection.
Local mode. Writes all data to a file and retrieves it from that file. This is much faster than building a real connection and retrieving data from it as long as the amount of data is small.
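Switching between the lightweight and the real database is easiest when the JDBC URL is chosen in one place. The following sketch shows one possible way to do that. The class TestDatabaseUrls, the property name crm.test.db, and the database names and paths are assumptions made for illustration. The jdbc:hsqldb:mem: and jdbc:hsqldb:file: prefixes are the forms used by current HSQLDB releases; older releases may expect different URLs, so check the documentation of the version in use.

```java
// Hypothetical helper: picks the JDBC URL for the persistent tests.
final class TestDatabaseUrls {
    // Assumed property name; not part of HSQLDB or of the CRM example.
    static final String MODE_PROPERTY = "crm.test.db";

    private TestDatabaseUrls() {}

    static String forMode(String mode) {
        if (mode == null || mode.equals("memory")) {
            // in-memory database, the default during development
            return "jdbc:hsqldb:mem:crmtest";
        }
        if (mode.equals("file")) {
            // file-based database; the path is an arbitrary example
            return "jdbc:hsqldb:file:build/testdb/crmtest";
        }
        if (mode.equals("real")) {
            // the URL used in setUp() of CRMDatabaseTest
            return "jdbc:odbc:CRM";
        }
        throw new IllegalArgumentException("unknown test database mode: " + mode);
    }

    // Reads the mode from a system property, e.g. -Dcrm.test.db=real
    static String current() {
        return forMode(System.getProperty(MODE_PROPERTY));
    }
}
```

In setUp(), the test would then create the database with new CRMDatabase(TestDatabaseUrls.current()), so that the integration build can select the real database without any change to the test code.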

This method of improving the performance of persistent tests comes with a caveat, though: using different databases for development and deployment can hide problems caused by different SQL dialects and different capabilities of the JDBC drivers used. You should therefore run a sufficient number of persistent tests against the deployment database of choice, at least once in a while. Another way to speed things up is to start from a data repository offering preconfigured objects for all persistent test cases, instead of starting from an empty database. In this case, the attempt to reset all modified data to its initial state in tearDown() will fail, at the latest, when some attributes of the preconfigured records change. Still, we can think of two variants:

We envelop the test in a transaction that is started in setUp() and rolled back in tearDown(). Note, however, that this works only if
- our database supports nested transactions and we do not run any tests requiring a commit of the outermost transaction; or
- we only test methods that are not themselves protected by transactions.
With optimistic locking strategies, the final transaction commit can fail under certain circumstances. Because the tests never commit, this type of error cannot be discovered in this way.
At times, it may be faster to load the initial test state from a database dump or by use of an SQL script instead of creating the fixture object by object. On the other hand, it will then take more effort to adapt the script or constantly recreate the dump file.
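The first variant above can be sketched with plain JDBC calls: switching off auto-commit starts a transaction implicitly, and rollback() discards everything done since. The class names RollbackFixture, RecordingConnections, and RollbackFixtureDemo are hypothetical. So that the sketch runs without a JDBC driver, it uses a stand-in connection that merely records the names of the methods called on it. A real test would pass the connection to the database under test instead.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical fixture: everything a test does on the connection is undone.
class RollbackFixture {
    private final Connection connection;

    RollbackFixture(Connection connection) {
        this.connection = connection;
    }

    // Call from setUp(): JDBC starts a transaction implicitly once
    // auto-commit is switched off.
    void begin() throws SQLException {
        connection.setAutoCommit(false);
    }

    // Call from tearDown(): discard all changes made by the test.
    void end() throws SQLException {
        try {
            connection.rollback();
        } finally {
            connection.setAutoCommit(true);
        }
    }

    Connection connection() {
        return connection;
    }
}

// Stand-in connection that only records method names, so the sketch runs
// without a JDBC driver.
class RecordingConnections {
    static Connection create(List<String> calls) {
        return (Connection) Proxy.newProxyInstance(
                RecordingConnections.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                (proxy, method, args) -> {
                    calls.add(method.getName());
                    return null;  // fine here: only void methods are called
                });
    }
}

class RollbackFixtureDemo {
    public static void main(String[] args) throws SQLException {
        List<String> calls = new ArrayList<>();
        RollbackFixture fixture =
                new RollbackFixture(RecordingConnections.create(calls));
        fixture.begin();
        // ... the test would use fixture.connection() here ...
        fixture.end();
        System.out.println(calls);  // prints [setAutoCommit, rollback, setAutoCommit]
    }
}
```

The sketch inherits the restrictions listed above: as soon as the code under test commits on the same connection, the rollback in end() can no longer undo its changes.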

JDBC Mocks

What can we do if our persistent test suite still runs too slowly? It is "too slow" when the long wait makes us run it less often than necessary, that is, when we no longer run it at least before each integration of modified sources into the total project. Why not use mock objects in this case, too? By their nature, mock objects are implementation-specific; that is, we have to know exactly how the class behind the CRMPersistence interface implements persistence. In our example, this means direct persistence (i.e., JDBC is addressed without a detour over a persistence framework). A good place to apply the mock approach could, therefore, be an additional constructor. Instead of the database URL, we give this constructor an instance of the java.sql.Connection type:

public class CRMDatabase implements CRMPersistence {
    public CRMDatabase(Connection connection)
            throws CRMException {
        ...
    }
    ...
}

Fortunately, most interfaces in the java.sql package can basically be "mocked" without major problems. JDBC database access follows this scheme: the Connection instance uses createStatement() to create a Statement object. The latter, in turn, has several methods for SQL calls, such as executeQuery(String sqlQuery), which return ResultSet instances. Such a result set can then be iterated to read the result rows and the individual column values. For this reason, a single MockConnection object is not sufficient for meaningful testing. This connection dummy itself has to create MockStatement instances, and the latter have to create MockResultSets. This means that configuring a mock connection is anything but easy. We have to define which mock statements to supply in which order, when and how often a commit() should be sent, and so on. And we have to deal with similar complexity for our mock statements and mock result sets. Moreover, the inter-object invocation sequence is very important for a correct implementation. For example, a ResultSet instance should no longer be used once the Statement object that created it has been closed with close(). The following problems can result:

The actual number and order of required statements, queries, commits, and the like changes often during a refactoring process. For this reason, the effort involved in adapting mock-based tests can be very high.
The functionality of required mock objects is no longer trivial, causing a considerable development effort.
Because of the complex state-based semantics of the JDBC interface, it is hard to gain confidence that our mock tests really provide the safety net required for refactoring steps.
We also have to validate dynamically generated SQL commands against the database.

For all these reasons, the use of mock objects in this case carries subtle risks that call the value of the whole undertaking into question. Our objective of speeding up unit tests for persistence can often also be achieved by using a lightweight database as described in the preceding subsection, Speeding Up the Test Suite.

Those who still want to experiment with mock objects in the JDBC environment are referred to the package com.mockobjects.eziba.sql at [URL:MockObjects]. This package can relieve you from a considerable part of the effort involved in mock implementation. At [URL:MockJDBC], Steve Freeman uses an example to show how the mock classes of this package can help in the test-first development of a JDBC program.
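To give an impression of what the chain of connection, statement, and result set stand-ins involves, here is a minimal hand-rolled sketch that does not use any mock library. It builds the three objects with java.lang.reflect.Proxy and supports only createStatement(), executeQuery(String), next(), getString(int), and close(). The classes JdbcStubs and CategoryQueries, as well as the table and column names in the SQL string, are assumptions made for illustration. They are not taken from the package mentioned above or from the CRM example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hand-rolled JDBC stand-ins built on dynamic proxies (hypothetical helper).
class JdbcStubs {
    // SQL strings passed to executeQuery(), in call order.
    final List<String> executedSql = new ArrayList<>();
    private final String[][] rows;

    JdbcStubs(String[][] rows) {
        this.rows = rows;
    }

    Connection connection() {
        return proxy(Connection.class, (p, method, args) -> {
            if (method.getName().equals("createStatement")) {
                return statement();
            }
            if (method.getName().equals("close")) {
                return null;
            }
            throw new UnsupportedOperationException(method.getName());
        });
    }

    private Statement statement() {
        return proxy(Statement.class, (p, method, args) -> {
            switch (method.getName()) {
                case "executeQuery":
                    executedSql.add((String) args[0]);
                    return resultSet();
                case "close":
                    return null;
                default:
                    throw new UnsupportedOperationException(method.getName());
            }
        });
    }

    private ResultSet resultSet() {
        int[] cursor = { -1 };  // index of the current row
        return proxy(ResultSet.class, (p, method, args) -> {
            switch (method.getName()) {
                case "next":
                    return ++cursor[0] < rows.length;
                case "getString":
                    // only getString(int columnIndex) is supported; 1-based
                    return rows[cursor[0]][(Integer) args[0] - 1];
                case "close":
                    return null;
                default:
                    throw new UnsupportedOperationException(method.getName());
            }
        });
    }

    private static <T> T proxy(Class<T> type, InvocationHandler handler) {
        return type.cast(Proxy.newProxyInstance(
                JdbcStubs.class.getClassLoader(), new Class<?>[] { type }, handler));
    }
}

// Example of code under test: reads all category names via plain JDBC.
class CategoryQueries {
    static List<String> loadNames(Connection connection) throws SQLException {
        List<String> names = new ArrayList<>();
        try (Statement statement = connection.createStatement();
             ResultSet result = statement.executeQuery("SELECT name FROM category")) {
            while (result.next()) {
                names.add(result.getString(1));
            }
        }
        return names;
    }
}
```

A test would create new JdbcStubs(new String[][] { {"cat1"}, {"cat2"} }), pass stubs.connection() to CategoryQueries.loadNames(...), and then compare both the returned names and stubs.executedSql with the expected values. Note how much is still missing: the stand-ins do not check the order of calls, know nothing about commit() or rollback(), and happily serve a result set whose statement is already closed. This is exactly the kind of effort and uncertainty the list above warns about.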

Evolution of the Persistence Technology

Sharp readers will probably have noticed that we haven't yet asked whether using an SQL database makes sense at all. In fact, many projects begin with the explicit specification that all persistent data should be in database XYZ of vendor ZYX. On the other hand, if we are free to select a technology, then using a complex commercial database system from the outset will be the best solution only in very few cases. If we stick tightly to the principles of test-first development, the history of the persistence mechanism used typically looks more or less as follows:

The first requirement for persistence often consists of simple configuration data that can best be saved to a file by using Java's Properties class.
At a later point, we decide to store the state of the app as an object mesh. Java's serialization is suitable for this purpose.
At some further point in time, a new user story requires frequent and targeted saving and reading of specific objects. For this reason, we select a freely available SQL database that does not involve much administration work and use JDBC to directly write our few classes to that database.
As we progress, there will be more classes calling for persistence. Considering that manually mapping our objects to tables will be too expensive, we decide to look for an object-relational mapping tool.
New requirements demand complex transaction behavior patterns, very high throughput rates, or absolutely safe recovery capabilities of the system. And now a costly and maintenance-intensive database system may appear justified.
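The first two stages of this history need very little code, which is part of their appeal. The following sketch shows both round trips using only the standard library. The class name EarlyPersistence and the comment text passed to store() are hypothetical, and the sketch writes to in-memory buffers instead of files to stay self-contained. A real application would presumably use file-based readers, writers, and streams in the same places.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;

class EarlyPersistence {
    // Stage 1: configuration data as key/value pairs.
    static String saveConfig(Properties config) throws IOException {
        StringWriter out = new StringWriter();
        config.store(out, "CRM configuration");
        return out.toString();
    }

    static Properties loadConfig(String text) throws IOException {
        Properties config = new Properties();
        config.load(new StringReader(text));
        return config;
    }

    // Stage 2: a whole object mesh via Java serialization.
    static byte[] saveState(Serializable root) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        }
        return bytes.toByteArray();
    }

    static Object loadState(byte[] data)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }
}
```

Both pairs of methods are easy to test in isolation, because saving and loading again must yield an equal object.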

Of course, some sort of turnoff from this road might be possible, such as using native Java databases or selecting a commercial OODBMS. The specifics of our project are the only important thing to keep in mind. As advocates of the test-first approach, we should be careful not to be put off by the common wisdom, "You simply don't question the use of an RDBMS."