This chapter showed how, and within what limits, the test-first approach can be used to develop multi-threaded programs. The normal behavior of asynchronous services and synchronization objects can be tested, provided the developers know the required patterns. Specialized classes, like those included in the package utmj.threaded, make this testing work easier.

In contrast, testing for problems that occur only sporadically (e.g., deadlocks and poor synchronization) is difficult because of the inherent nondeterminism. We did manage to create a single test case that contributed to removing a faulty notify call. However, we suspect that this was possible only because we had already identified the problem as a standard error.

The problem with these tests is not that they cannot discover all concurrency errors; what test is infallible? The problem is rather that these tests (necessarily) rest on many assumptions, so they may lead us to believe we are on the safe side. A particular JVM implementation can then expose that sense of security as false. As bitter as it may be for the thoroughbred tester, other measures often contribute more to avoiding concurrency errors than expensive unit test suites:

Depending on the application, synchronization problems can also be discovered by extensive load tests, by randomly inserted calls to Thread.sleep() and Thread.yield(), or by thread analysis tools (e.g., Sitraka's Threadalyzer [URL:Threadalyzer]). In addition, running these tests on multiprocessor (MP) machines is often a quicker way to detect synchronization problems because MPs generate more thread interleavings. Some errors related to the memory model can be observed only on multiprocessor machines.[9]

An innovative and interesting path towards detecting race conditions, deadlocks, and other intermittent bugs in multi-threaded Java programs has been followed by IBM's research project ConTest [URL:ConTest]. Quoting from the Web site:

ConTest transforms a Java program into a program that should behave in the same way but is more likely to exhibit concurrent bugs such as race conditions and deadlocks. This seemingly strange behavior is useful because finding bugs in the transformed app is easier. Every bug found in the transformed app is a bug in the original program.... You can just rerun your original tests but you are now more likely to find bugs. ConTest ... alleviates the need to create a complex testing environment with many processors and apps, and works by instrumenting the bytecode of the app with heuristically controlled conditional sleep and yield instructions.
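The idea of randomly inserted sleep and yield calls can also be tried by hand on a small scale. The following is only a minimal sketch under our own assumptions: the classes Noise, Counter, and NoiseDemo are hypothetical names invented for this illustration and belong neither to utmj.threaded nor to ConTest, which instruments bytecode automatically instead of requiring calls in the source code. The sketch places a noise call between the read and the write of a counter update, so that lost updates in the unsynchronized variant become much more likely to show up.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper that randomly perturbs thread scheduling. */
final class Noise {
    private Noise() {}

    /** Call at points where a thread switch might expose a race. */
    static void maybe() {
        int choice = ThreadLocalRandom.current().nextInt(20);
        if (choice == 0) {
            try {
                Thread.sleep(1);                     // occasionally sleep briefly
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // preserve interrupt status
            }
        } else if (choice <= 5) {
            Thread.yield();                          // more often, just yield
        }
    }
}

/** Counter with a deliberately split read-modify-write step. */
final class Counter {
    private int value;

    /** Unsafe: other threads may interleave between read and write. */
    void incrementUnsafe() {
        int v = value;
        Noise.maybe();
        value = v + 1;
    }

    /** Safe: the lock makes the read-modify-write step atomic. */
    synchronized void incrementSafe() {
        int v = value;
        Noise.maybe();
        value = v + 1;
    }

    synchronized int get() {
        return value;
    }
}

final class NoiseDemo {
    /** Runs the given number of threads and returns the final count. */
    static int run(int threads, int iterations, boolean safe) {
        Counter counter = new Counter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < iterations; j++) {
                    if (safe) {
                        counter.incrementSafe();
                    } else {
                        counter.incrementUnsafe();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("synchronized:   " + run(4, 200, true));
        System.out.println("unsynchronized: " + run(4, 200, false));
    }
}
```

The synchronized variant always counts every increment, whereas the result of the unsynchronized variant varies from run to run and will usually fall short of the expected total. A passing run of the unsynchronized variant would still prove nothing; the noise only raises the probability that an existing error becomes visible.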

[8] A thorough summary of those techniques can be found in [Gomaa93].
[9] See [Lea00], section 2.2, for a discussion of these issues.