With that brief introduction to dom4j, let us begin by looking at the interfaces and classes that make up dom4j. We'll start with the core interfaces and then examine some of the special features those interfaces have that set dom4j apart from other similar APIs.

Core dom4j

As I mentioned above, dom4j is built around a set of core interfaces. These interfaces describe the structure and content of an XML document. The figure below contains a UML model of these core interfaces.

UML model of dom4j core interfaces


As you can see from the model diagram, dom4j has several levels of interfaces. Every interface ultimately extends the Node interface, which defines common functionality for all components of an XML document and is analogous to org.w3c.dom.Node. The CharacterData and Branch interfaces similarly define common functionality for nodes that contain text and nodes that contain other nodes, respectively.

Factories

Since the core of dom4j is a set of interfaces, you use a factory object to obtain implementations of these interfaces. The default factory class is org.dom4j.DocumentFactory. The figure below contains the class diagram for DocumentFactory. The various create methods enable you to create instances of the corresponding dom4j interface: calling createElement( ) returns an Element instance, createAttribute( ) returns an Attribute instance, and so on. The createXPath( ), createXPathFilter( ), and createPattern( ) methods are slightly different in that they return objects that operate on Node objects; these creation methods will be explored in greater depth later in this chapter.

DocumentFactory


DocumentFactory returns instances of classes in the org.dom4j.tree package such as DefaultElement and DefaultAttribute. As you'll see later in this chapter, the dom4j distribution comes with a handful of subclasses of DocumentFactory that create alternate implementations of some of the dom4j interfaces. For example, org.dom4j.util.IndexedDocumentFactory creates instances of org.dom4j.util.IndexedElement instead of DefaultElement. IndexedElement builds up maps of the element's attributes and child elements. This results in a slight performance hit on every addition of an attribute or child element, but faster lookups of attributes and child elements by name. Although these alternate factories are not strictly required to extend DocumentFactory, in practice a special-purpose factory only needs to override a few create methods, so extending DocumentFactory is the natural choice.
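To make the trade-off concrete, here is a minimal sketch of the same idea in plain Java. None of these types are dom4j's: SketchElement, ListElement, IndexedListElement, SketchFactory, and IndexedSketchFactory are hypothetical names invented for this illustration, and the real IndexedElement may well be organized differently. The point is only that a special-purpose factory overrides one create method, and that the indexed element pays a little on every add in exchange for a map lookup by name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an element interface; NOT org.dom4j.Element.
interface SketchElement {
    String getName();
    void add(SketchElement child);
    SketchElement child(String childName); // first child with that name, or null
}

// Default implementation: cheap add, linear scan when looking up by name.
class ListElement implements SketchElement {
    private final String name;
    private final List<SketchElement> children = new ArrayList<>();

    ListElement(String name) { this.name = name; }

    public String getName() { return name; }

    public void add(SketchElement child) { children.add(child); }

    public SketchElement child(String childName) {
        for (SketchElement c : children) {
            if (c.getName().equals(childName)) {
                return c;
            }
        }
        return null;
    }
}

// Indexed implementation: extra bookkeeping on every add, map lookup by name.
class IndexedListElement extends ListElement {
    private final Map<String, SketchElement> firstByName = new HashMap<>();

    IndexedListElement(String name) { super(name); }

    @Override
    public void add(SketchElement child) {
        super.add(child);
        firstByName.putIfAbsent(child.getName(), child); // keep the first match
    }

    @Override
    public SketchElement child(String childName) {
        return firstByName.get(childName);
    }
}

// Hypothetical default factory with a single create method.
class SketchFactory {
    SketchElement createElement(String name) { return new ListElement(name); }
}

// Special-purpose factory: overrides only the create method it cares about.
class IndexedSketchFactory extends SketchFactory {
    @Override
    SketchElement createElement(String name) { return new IndexedListElement(name); }
}

class FactoryDemo {
    // The building code is identical no matter which factory is passed in.
    static SketchElement build(SketchFactory factory) {
        SketchElement root = factory.createElement("book");
        root.add(factory.createElement("title"));
        root.add(factory.createElement("author"));
        return root;
    }

    public static void main(String[] args) {
        SketchElement plain = build(new SketchFactory());
        SketchElement indexed = build(new IndexedSketchFactory());
        System.out.println(plain.getClass().getSimpleName());
        System.out.println(indexed.getClass().getSimpleName());
        System.out.println(indexed.child("author").getName());
    }
}
```

Because the calling code only ever sees the interface, swapping one factory for the other changes the performance profile without changing a line of the code that builds or reads the tree.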

dom4j Features

In addition to the standard document processing mechanisms that are common to DOM, JDOM, and dom4j, dom4j has a few unique features that will be explored later in this chapter.

XPath support

XPath support in dom4j is provided in two ways: by an XPath class, which allows you to precompile XPath expressions and then evaluate them against a dom4j object, and by a handful of methods on the Node interface. These methods, discussed later in this chapter, evaluate an XPath expression using the Node on which the method is called as the context for the evaluation.
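The compile-once, evaluate-many idea and the notion of a context node are easier to see in code. The sketch below deliberately uses the JDK's own javax.xml.xpath and org.w3c.dom classes as a stand-in so that it runs without dom4j on the classpath; it is not dom4j's XPath class, and the XPathSketch class and its helper methods are names I made up for the illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Stand-in using the JDK's XPath API; this is NOT the dom4j XPath class.
class XPathSketch {

    // Parses a small XML string into a W3C DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Compiles the expression once so it can be reused against many contexts.
    static XPathExpression compile(String expression) {
        try {
            return XPathFactory.newInstance().newXPath().compile(expression);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Evaluates a compiled expression with the given node as the context.
    static int count(XPathExpression expression, Object context) {
        try {
            NodeList nodes = (NodeList) expression.evaluate(context, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document one = parse("<books><book><title>A</title></book></books>");
        Document two = parse(
                "<books><book><title>B</title></book><book><title>C</title></book></books>");

        // One compiled expression, evaluated against two different documents.
        XPathExpression allTitles = compile("//title");
        System.out.println(count(allTitles, one));
        System.out.println(count(allTitles, two));

        // A relative expression: the result depends on the context node.
        XPathExpression childTitles = compile("title");
        Node firstBook = two.getDocumentElement().getFirstChild();
        System.out.println(count(childTitles, firstBook));
    }
}
```

The relative expression at the end is the important part: the same compiled expression gives different answers depending on which node it is evaluated against, which is exactly the role the context Node plays in the text above.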

Support for Visitor Pattern

The Node interface also defines a method named accept( ), which is passed a Visitor object. Through this method, dom4j nodes support the Visitor Pattern. The Visitor Pattern separates the logic necessary to traverse an object structure (an XML document in this case) from the logic one wants to perform on each object within that structure (i.e., the Node objects). In dom4j, passing a Visitor object to the accept( ) method of a Node causes the Node object to pass itself to the Visitor object's visit( ) method. If the Node has children, then after passing itself to visit( ), it will pass each of its children to the visit( ) method. I will provide more detail later in this chapter.
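If you have not met the Visitor Pattern before, a small self-contained sketch may help. The types below (TreeNode, TreeElement, TreeText, SketchVisitor, NameCollector) are hypothetical and exist only for this illustration; they are not the dom4j Node and Visitor interfaces, though the accept( ) and visit( ) methods play the same roles described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical visitor with one visit method per node type; NOT org.dom4j.Visitor.
interface SketchVisitor {
    void visit(TreeElement element);
    void visit(TreeText text);
}

// Hypothetical node interface; NOT org.dom4j.Node.
interface TreeNode {
    void accept(SketchVisitor visitor);
}

// A leaf node: it simply hands itself to the visitor.
class TreeText implements TreeNode {
    final String value;

    TreeText(String value) { this.value = value; }

    public void accept(SketchVisitor visitor) { visitor.visit(this); }
}

// A node with children: it owns the traversal logic.
class TreeElement implements TreeNode {
    final String name;
    private final List<TreeNode> children = new ArrayList<>();

    TreeElement(String name) { this.name = name; }

    TreeElement add(TreeNode child) {
        children.add(child);
        return this; // allow chained calls when building a tree
    }

    public void accept(SketchVisitor visitor) {
        visitor.visit(this);              // first pass this node to the visitor
        for (TreeNode child : children) {
            child.accept(visitor);        // then let each child do the same
        }
    }
}

// The per-node logic lives here, separate from the traversal.
class NameCollector implements SketchVisitor {
    final List<String> seen = new ArrayList<>();

    public void visit(TreeElement element) { seen.add("element:" + element.name); }

    public void visit(TreeText text) { seen.add("text:" + text.value); }
}

class VisitorDemo {
    static List<String> run() {
        TreeElement root = new TreeElement("book")
                .add(new TreeElement("title").add(new TreeText("Java and XML")))
                .add(new TreeElement("author"));
        NameCollector collector = new NameCollector();
        root.accept(collector);
        return collector.seen;
    }

    public static void main(String[] args) {
        // prints [element:book, element:title, text:Java and XML, element:author]
        System.out.println(run());
    }
}
```

Notice that NameCollector contains no loops at all. Writing a second visitor, say one that counts elements, requires no changes to the node classes.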

Object-orientated transformation API

In addition to supporting the JAXP TrAX API for XSL transformations, dom4j includes its own entirely object-orientated transformation API, called the rule API. Through this API, you create a Stylesheet object, which contains one or more Rule objects. Each Rule object defines an action to be taken and a pattern that determines when that action should be taken. Using this API, it is possible to do many of the tasks you would normally do with XSLT without leaving the comforts of Java. We will explore this later in this chapter.
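The pairing of a pattern with an action is the heart of this design, and it can be sketched in a few lines of plain Java. The SketchRule and SketchStylesheet classes here are hypothetical and far simpler than dom4j's rule API: patterns are ordinary predicates over element names rather than XPath patterns, and I have assumed that the first matching rule wins, which is a simplification of my own rather than a statement about how dom4j resolves conflicts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical rule: a pattern paired with an action. NOT dom4j's Rule class.
class SketchRule {
    final Predicate<String> pattern; // decides when the rule applies
    final Consumer<String> action;   // what to do when it applies

    SketchRule(Predicate<String> pattern, Consumer<String> action) {
        this.pattern = pattern;
        this.action = action;
    }
}

// Hypothetical stylesheet: an ordered collection of rules.
class SketchStylesheet {
    private final List<SketchRule> rules = new ArrayList<>();

    void addRule(SketchRule rule) { rules.add(rule); }

    // For each name, fires the first rule whose pattern matches (an assumption
    // made for this sketch); names that match no rule are skipped.
    void run(List<String> elementNames) {
        for (String name : elementNames) {
            for (SketchRule rule : rules) {
                if (rule.pattern.test(name)) {
                    rule.action.accept(name);
                    break;
                }
            }
        }
    }
}

class RuleDemo {
    // Maps a sequence of element names to output markers using two rules.
    static List<String> transform(List<String> elementNames) {
        List<String> out = new ArrayList<>();
        SketchStylesheet sheet = new SketchStylesheet();
        sheet.addRule(new SketchRule(n -> n.equals("title"), n -> out.add("<h1/>")));
        sheet.addRule(new SketchRule(n -> n.equals("para"), n -> out.add("<p/>")));
        sheet.run(elementNames);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(transform(Arrays.asList("title", "para", "note", "para")));
    }
}
```

As with an XSLT template, each rule knows nothing about the others; the stylesheet decides which one fires for a given input.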

The dom4j Distribution

The dom4j distribution is available through . It's available either as a standalone JAR file, in which case you'll have to provide all of the dependencies yourself, or as a full distribution with all dependencies included. As usual, the full distribution is available either as a ZIP file for Windows users or as a GZipped TAR file for everyone else, including those using Unix, Linux, and OS X. As with JDOM, the source is included with the full distribution and there is no separate source-only distribution.

In the full distribution, the dom4j JAR file is located in the root directory. The JAR files dom4j depends upon are in the lib directory. Of the dependency JAR files, only the jaxen JAR file is required for any use of dom4j. The others may be required depending on which parts of dom4j you use. For example, if you wanted to use a StAX parser (discussed elsewhere in this book) to build dom4j Document objects, you would need the StAX APIs and an implementation on your classpath.

You now have a good idea of what dom4j can do. In the next section, we'll look at some basic examples for reading, creating, and outputting XML documents with dom4j.