Automating Continuous Integration with AntHill

As the complexity of software development continues to grow, a project team needs to be aware of the processes and practices used in its software development methodology. Whether the methodology is waterfall or Extreme Programming, all software must be compiled, integrated, and tested. In many projects, this process of compiling, integrating, and testing is performed on an ad hoc basis, triggered when a team leader or manager dictates that it is time to begin pulling the new release together. However, many projects and most of the software configuration management literature will tell you that "building" the software should take place on a nightly basis at a minimum and potentially several times a day. In this chapter, we introduce the open-source product AntHill, developed by urban{code}, and show how to use AntHill to fully automate the build process. Further, we will show how to use AntHill with the Petstore project.

Where to Get AntHill

AntHill is both a commercial and an open-source product. The open-source product can be obtained from the following URL: http://www.urbancode.com/projects/anthill/download.jsp. There are both binary and source download files available. For the sake of this chapter, we will assume that you are downloading the binary file.

AntHill Primer

In an effort to help the open source community and promote the idea of continuous integration, urban{code} has released a slimmed-down version of its AntHill Pro package called AntHill OS. The OS stands for open source. AntHill is a continuous integration tool designed to provide the following:

A controlled build process.
The ability to track and reproduce builds.
The ability to signal build status through e-mail, CM labels, and so on.
The ability to fully automate the build process, run tests, and generate metrics.

These tasks are important to the development of a successful project because without them the project pieces would be working in a vacuum. If you are new to software development or haven't been a part of a medium/large project, this idea of continuous integration might seem like a foreign topic when, in fact, the idea of doing automatic nightly builds and running a smoke test after a successful build has been around for just about as long as software development itself.

Consider as an example a very large project I worked on several years ago. In this project, we had roughly 50 developers working in five major areas, with most of the developers dedicated to a single component within the areas. These developers were slinging code on a daily basis. Each night the developers who thought their code was ready for the build, as well as developers who were just making maintenance changes, would check their code into the CM repository. At some magical time during the night, a build would be fired off on a dedicated build machine. If the build was successful, it would be automatically copied to a test machine and an informal smoke test would be executed on the code. A good deal of the fun in our project was the morning after a build failure. A build failure occurred when either the compiling/linking failed or the smoke test failed. In either case, the developer who caused the failure would receive a traveling trophy. For the most part, the embarrassment of the trophy was a joke, but there was also the serious side of knowing that the newly checked-in code had caused a failure in the system. When the failure was addressed in the morning, the build would be fired off again to make sure the code was again in a stable state.

So you can see that continuous integration is important, but what are some of the benefits? First, bugs are introduced into code on a daily basis but, without regular builds, might not be caught for weeks or months. Consider the times when code is being integrated from a variety of sources. At one point during the initial process, an entity might reveal the API of a key component to others in the team but not readily supply the code behind the interface. Since there is no code, there is no reason to compile, right? Well, what happens when the API changes and those changes aren't communicated? If we were under a strict build process, this situation would be immediately caught because the code wouldn't compile. The interface could be put through the build process by using stubs, with the stubs slowly replaced with real code. The point here is that API changes from one entity to another need to be communicated quickly, and if the project waits until integration, there will be a large amount of time needed to resolve the situation. Many projects will schedule a significant amount of time just for this integration because they know problems will exist simply because this would be the first time the different pieces of code would have been compiled together.

Second, when a continuous integration system has been set up and is working successfully, developers have a faster and better mechanism to verify that changes they have just made to the code are successful. We all know that developers do a good job of writing their own unit test cases and thoroughly testing their code, but in those rare cases where this doesn't occur, the nightly build will catch the problem before it spreads too far into the project.

Third, the build system will be designed in such a manner that developers are able to access the most up-to-date code possible when making changes. This helps the process because developers won't make a code change to a piece of source code just lying around in their home directory.

It is clear that a continuous integration system can be an important element of a software development process, but what's involved?
There are many different components necessary for a successful system. The components include:

Successful build process definition
Configuration management (CM) software
Build scripts
Automated testing
Controlling server

The build process definition tells the project and its team what constitutes a successful build. The definition is likely to be different for each project, but one definition might be the following: In order for a build process to be successful, it must obtain fresh copies of all source code, compile the code without errors and warnings (yes, there are projects that allow production code to have warnings), process the results of the compile into a deployment format, deploy the newly compiled software, run a series of system tests on the code, and report the results.

A configuration management system handles the safekeeping of the code. Developers are required to check out any code they wish to make changes to and then check the same code back in when they have finished. The CM system will keep track of the changes and provide a way of always being able to view and access older pieces of code. Further, the system will be able to track who made changes, so faulty source code can be traced to the person who made the coding mistake.

Scripts to automate the build are important because we don't want to hire a person to "do the build" during the night. The system should be able to automatically perform the build without human intervention.

Testing is important because the system isn't successful just because it was built. We don't believe in the statement, "Hooray, it compiled. Ship it!" All of the tests on the code that will be performed by the continuous integration process need to be of the automated variety. If inputs are needed, then automated testing tools might need to be employed, but there shouldn't be the need for a human to do the testing.

Finally, the controlling server is a software process that pulls together each of the components listed above. It is responsible for obtaining the latest source code from the configuration management system, compiling the code, putting together an appropriate deployment package, testing the deployment, and finally generating any necessary statistics or alerts.
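
To make the controlling server's role concrete, here is a minimal sketch in Python of the sequence described above: run each build step in order, stop at the first failure, and record a report. This is an illustration only, not how AntHill is implemented. The names run_build, BuildReport, and the step names are hypothetical, and the steps are placeholders where a real system would call the CM tool, the build scripts, and the test suite.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# A build step is a name plus a callable that returns True on success.
# (Hypothetical shape, chosen only for this sketch.)
Step = Tuple[str, Callable[[], bool]]


@dataclass
class BuildReport:
    """Outcome of one build run: overall status plus a per-step log."""
    succeeded: bool = True
    log: List[str] = field(default_factory=list)


def run_build(steps: List[Step]) -> BuildReport:
    """Run the steps in order and stop at the first one that fails."""
    report = BuildReport()
    for name, action in steps:
        try:
            ok = action()
        except Exception as exc:
            # A step that raises counts as a failed step.
            ok = False
            report.log.append(f"{name}: ERROR ({exc})")
        else:
            report.log.append(f"{name}: {'ok' if ok else 'FAILED'}")
        if not ok:
            report.succeeded = False
            break  # later steps are pointless once the build is broken
    return report


if __name__ == "__main__":
    # Placeholder steps mirroring the build definition in the text.
    steps: List[Step] = [
        ("checkout", lambda: True),
        ("compile", lambda: True),
        ("package", lambda: True),
        ("deploy", lambda: True),
        ("system tests", lambda: True),
    ]
    result = run_build(steps)
    for line in result.log:
        print(line)
    print("BUILD SUCCESSFUL" if result.succeeded else "BUILD FAILED")
```

Stopping at the first failed step reflects the build definition given earlier: a build that does not compile should never be deployed or tested, and the report should point at the step that broke.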

Now where does AntHill come into play? The answer is the controlling server. AntHill provides the process through which a comprehensive continuous integration system can be built.
