Advanced Topics in HttpUnit

If you have gotten this far in the chapter, you already know enough about HttpUnit to begin developing. With the source code and the reference chapter at your side, you should be able to write some mean Web testing code. The rest of this chapter will be devoted to giving an overview of the remaining issues in the HttpUnit framework and developing a more sophisticated testing program that leverages the power of HttpUnit's Java base. This section covers topics in HttpUnit such as DOM inspection, headers and validation, and HttpUnit's configurable options.

DOM Inspection

HttpUnit provides the ability to inspect an HTML document, or part of one, as a DOM. The Document Object Model (DOM) is a standard developed by the World Wide Web Consortium that treats documents as objects that can be manipulated. DOM was developed to allow programmatic access to data in languages such as XML and HTML. A full discussion of DOM lies outside the scope of this tutorial; if you need a primer, every Java journal with online archives is sure to have several articles on XML and its two common program interfaces, DOM and SAX (Simple API for XML). What you need to know about DOM and HttpUnit is that HttpUnit uses JTidy, an HTML parser, to turn a server response into an in-memory DOM whose contents can be accessed in any order. Almost all of HttpUnit's powerful assertion capabilities (WebResponse.getTables(), for instance) rely on DOM manipulation under the hood. For example, WebResponse.getLinks() uses

NodeList nl = NodeUtils.getElementsByTagName( _rootNode, "a" );
Vector list = new Vector();
/* keep only the <a> nodes that are real link anchors */
for (int i = 0; i < nl.getLength(); i++) {
 Node child = nl.item(i);
 if (isLinkAnchor( child )) {
  list.addElement( new WebLink( _baseURL, _baseTarget, child ) );
 }
}

to find all the Nodes (roughly corresponding to tags, in this case) with the name "a" (as in <a href="blah">) in the underlying HTML. Using the DOMs provided by HttpUnit can be somewhat difficult, simply because DOM coding can be difficult. The DOM API (documented in its Javadocs) was designed to be language independent and thus doesn't jibe with Java as well as it might. Even so, manipulating the HTML DOM can yield information that would otherwise be unavailable for assertion. As an example, let's return to our earlier test of the sales report page. The sales report tables were labeled with captions, an element that HttpUnit neither searches for nor returns. Let's say that it becomes essential to validate these captions. The following DOM code could find the value of the <caption> element for inspection:

/**
 * Finds the caption text of the table nested inside a cell.
 * Uses Element, Node, and NodeList from org.w3c.dom.
 * @param cell A table cell containing a single nested table.
 */
private String findCaption(TableCell cell) throws Exception{
 Node node = cell.getDOM();
 Element elem = (Element)node;
 NodeList listOfCaptions = elem.getElementsByTagName("caption");
 /*presume only 1*/
 Node firstCaption = listOfCaptions.item(0);
 /*contents are actually contained in a child node of the caption Node*/
 Node contents = firstCaption.getFirstChild();
 return contents.getNodeValue();
}

As you can see, DOM code can get a bit involved. But, on the bright side, our sales report test is now more accurate!
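The caption lookup above presumes that a caption exists and that it has a text child; if either is missing, the method fails with a NullPointerException. As a minimal sketch of a more defensive version, the following uses only the DOM classes bundled with the JDK. CaptionFinder and its methods are hypothetical names, not part of HttpUnit, and the sketch parses a well-formed XHTML string with the JDK's XML parser instead of going through JTidy, so that it can run without a server.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical helper for illustration; not part of HttpUnit. */
final class CaptionFinder {

    /** Parses well-formed markup into a DOM with the JDK's built-in XML parser. */
    static Element parse(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            // Assumption: callers treat malformed input as a programming error.
            throw new IllegalArgumentException("markup is not well-formed", e);
        }
    }

    /** Returns the text of the first <caption> below root, or null if there is none. */
    static String findCaption(Element root) {
        NodeList captions = root.getElementsByTagName("caption");
        if (captions.getLength() == 0) {
            return null; // no caption element at all
        }
        // The text lives in a child node of the caption element.
        Node contents = captions.item(0).getFirstChild();
        return contents == null ? null : contents.getNodeValue();
    }
}
```

In an HttpUnit test, the Element obtained by casting the result of cell.getDOM() could presumably be handed to findCaption directly, and a null result could then be reported with a clear assertion message.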

Headers and Cookies

HttpUnit's WebConversation class internally stores all the headers that will be sent to the Web server with each request. Naturally, this storage covers cookies, authentication headers, and various other request headers. To set a header for transmission to the server with all future requests, use WebConversation's setHeaderField(String fieldName, String fieldValue) method. Shortcut methods exist for commonly used headers, such as authorization headers: setAuthorization(String userName, String password). Setting a header field to null removes the header from all subsequent requests.

Cookies are handled slightly differently. WebConversation's getCookieNames() method returns the names of all the cookies to be sent, and getCookieValue() returns the value of an individual cookie. addCookie(String name, String value) adds a cookie to the list of cookies to send. Unlike with header fields, there is no way to remove an individual cookie from the list. As for server-defined cookies, the WebResponse class allows inspection of new cookies through two methods: getNewCookieNames() and getNewCookieValue(). For example, the following code would print all the new cookies set in a Web server response:

String[] names = response.getNewCookieNames();
for (int i = 0; i < names.length; i++) {
 System.out.println(names[i] + " -- " + response.getNewCookieValue(names[i]));
}

This facility allows verification that the server has tried to set a specific cookie in the response.
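At the HTTP level, a new cookie is simply a Set-Cookie header in the response. The following sketch uses only java.net.HttpCookie from the JDK, not HttpUnit, to show roughly what information the new-cookie methods expose: the names and values carried by those headers. NewCookies, fromHeaders, and the sample header values are hypothetical and exist only for illustration.

```java
import java.net.HttpCookie;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of HttpUnit. */
final class NewCookies {

    /** Collects name -> value for every cookie set by the given Set-Cookie header lines. */
    static Map<String, String> fromHeaders(List<String> setCookieHeaders) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String header : setCookieHeaders) {
            // HttpCookie.parse accepts a full "Set-Cookie: ..." header line.
            for (HttpCookie cookie : HttpCookie.parse(header)) {
                cookies.put(cookie.getName(), cookie.getValue());
            }
        }
        return cookies;
    }
}
```

A test that expects the server to set a session cookie would then assert that the expected name is present and, where the value is predictable, that the value matches.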


Frames

HttpUnit handles frames in a straightforward manner; the WebClient stores a Hashtable of frames internally. The contents of a given frame can be accessed by name with

WebResponse response = webClient.getFrameContents("someFrameName");

If a test follows a link from within one frame that updates another frame, the response from the WebConversation will be the contents of the target frame (which will also be accessible through getFrameContents).


SSL

HttpUnit can test sites that use Secure Sockets Layer (SSL), but doing so is somewhat involved. Two basic conditions must be met: The server must have an SSL certificate installed, and the JVM used by HttpUnit must trust the installed certificate. (Certificates from Verisign or Thawte are automatically trusted.) A number of technical details surround SSL support in Java/HttpUnit, most of which are covered in the SSL FAQ hosted on HttpUnit's SourceForge site.
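One common way to make a JVM trust a certificate that it does not trust by default is to import the certificate into a truststore file with the JDK's keytool utility and then point the standard JSSE system properties at that file before the first SSL connection is opened. The sketch below shows only the second step. TrustStoreSetup, the file path, and the password are placeholders, and the SSL FAQ remains the authority on what HttpUnit itself requires.

```java
/** Hypothetical helper for illustration; not part of HttpUnit. */
final class TrustStoreSetup {

    /**
     * Points the JVM's default SSL trust configuration at a custom truststore.
     * Call this before any SSL connection is made, for example in test setup.
     */
    static void trust(String trustStorePath, String password) {
        // Standard JSSE system properties read when the default SSL context is created.
        System.setProperty("javax.net.ssl.trustStore", trustStorePath);
        System.setProperty("javax.net.ssl.trustStorePassword", password);
    }
}
```

A test class might call TrustStoreSetup.trust("test-truststore.jks", "changeit") once during setup, where both arguments are placeholder values for a truststore created for the test server.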


HttpUnitOptions

The HttpUnitOptions class provides a series of static properties that configure the behavior of HttpUnit. HttpUnitOptions provides options that determine whether link searches are case-sensitive, whether to follow page refresh requests automatically, and whether to print headers to the System.out stream as they are sent and received. It is worth noting that because these properties are merely static variables, you need to exercise some care in using them in a multithreaded testing environment (such as running JUnitPerf with HttpUnit) if the options are set to conflicting values in different tests.
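One way to exercise that care is to funnel every change to a static option through a helper that holds a lock, sets the value, runs the test body, and restores the previous value. The sketch below illustrates the pattern with a stand-in Options class so that it runs without HttpUnit on the classpath. StaticOptionGuard, Options, and withAutoRefresh are hypothetical names; a real test would call the corresponding HttpUnitOptions setter instead.

```java
/** Hypothetical helper for illustration; not part of HttpUnit. */
final class StaticOptionGuard {

    /** Stand-in for a holder of static options such as HttpUnitOptions. */
    static final class Options {
        private static volatile boolean autoRefresh = false;
        static boolean getAutoRefresh() { return autoRefresh; }
        static void setAutoRefresh(boolean value) { autoRefresh = value; }
    }

    private static final Object LOCK = new Object();

    /** Runs body with the option set to value, then restores the previous value. */
    static void withAutoRefresh(boolean value, Runnable body) {
        synchronized (LOCK) { // only one guarded test body runs at a time
            boolean previous = Options.getAutoRefresh();
            Options.setAutoRefresh(value);
            try {
                body.run();
            } finally {
                Options.setAutoRefresh(previous); // restore even if the body throws
            }
        }
    }
}
```

The trade-off is that test bodies run through the guard execute one at a time, and any test that changes the option without going through the guard is not protected.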

Technical Limitations

HttpUnit provides much of the functionality of a Web browser from within a Java app. However, a number of things remain outside its purview. JavaScript is an obvious example: HttpUnit currently offers very limited JavaScript support. Russell Gold, the developer of HttpUnit, has stated that he plans to build in full JavaScript 1.1 support in the future, but for now only inline and included scripts and a subset of JavaScript DOM elements are usable. If JavaScript does not function at all, make sure that you have the Rhino JAR (js.jar) in your classpath.

HttpUnit does not forgive bad HTML. This strictness can cause problems when you test pages that display correctly in major browsers but do not strictly adhere to the HTML specification. Frequently, the problem has to do with <form> tags, which must be nested correctly within tables. Unfortunately, the HTML must be corrected to fix this problem, although calling HttpUnitOptions.setParserWarningsEnabled(true) will at least indicate HTML problems encountered during parsing.