Common Issues with JDOM

The following sections discuss some issues you may encounter when working with JDOM.

What Parser Am I Using?

Although I stated it previously, it is worth repeating that JDOM is not an XML parser. It uses an external parser through a builder class. As a result, what frequently appears to be a JDOM issue is actually a problem with the underlying parser. Be sure you understand which parser you are using, or specify the parser class directly with the appropriate constructor of SAXBuilder.
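One way to find out which parser you are likely to get is to ask JAXP directly. The sketch below assumes that SAXBuilder, when no parser class is given, locates its parser through JAXP, which is my understanding of JDOM 1.x but worth confirming for your version. It uses only JDK classes; the class name WhichParser and the method defaultSaxParserClass( ) are illustrative names, not part of JDOM.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

public class WhichParser {

    /** Returns the class name of the SAX XMLReader that JAXP selects by default. */
    static String defaultSaxParserClass() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            XMLReader reader = factory.newSAXParser().getXMLReader();
            return reader.getClass().getName();
        } catch (Exception e) {
            // Configuration problems surface here; rethrow unchecked for brevity
            throw new IllegalStateException("No SAX parser could be created", e);
        }
    }

    public static void main(String[] args) {
        // The printed name depends on the JDK and on the JARs in the classpath
        System.out.println("JAXP default SAX parser: " + defaultSaxParserClass());
    }
}
```

Running this with the same classpath as your app shows whether a parser JAR you did not expect is being picked up ahead of the one you intended.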

JDOM Isn't DOM

First and foremost, you should realize that JDOM isn't DOM. It doesn't wrap DOM, and doesn't provide extensions to DOM. In other words, the two have no technical relation to each other. Realizing this basic truth will save you a lot of time and effort; there are many articles out there today that talk about getting the DOM interfaces to use JDOM, or avoiding JDOM because it hides some of DOM's methods. These statements are more likely to confuse than clarify. You don't need to have the DOM interfaces, and DOM calls (like appendChild( ) or createDocument( )) simply won't work on JDOM. Sorry, wrong API!

Null Return Values

Another interesting facet of JDOM, and one that has raised some controversy, is the return values from methods that retrieve element content. For example, the various getChild( ) methods on the Element class may return a null value. I mentioned this, and demonstrated it, in the PropsToXML example code. The gotcha occurs when instead of checking if an element exists (as was the case in the example code), you assume that an element already exists. This is most common when some other app or component sends you XML, and your code expects it to conform to a certain format (be it a DTD, XML Schema, or simply an agreed-upon standard). For example, take a look at the following code:

Document doc = otherComponent.getDocument( );
String price = doc.getRootElement( ).getChild("item")
                  .getChild("price")
                  .getTextTrim( );

The problem in this code is that if there is no item element under the root, or no price element under that, a null value is returned from the getChild( ) method invocations. Suddenly, this innocuous-looking code begins to emit NullPointerExceptions, which are quite painful to track down. You can handle this situation in one of two ways. The first is to check for null values each step of the way:

Document doc = otherComponent.getDocument( );
Element root = doc.getRootElement( );
Element item = root.getChild("item");
if (item != null) {
    Element priceElement = item.getChild("price");
    if (priceElement != null) {
        String price = priceElement.getTextTrim( );
    } else {
        // Handle exceptional condition: no price element under item
    }
} else {
    // Handle exceptional condition: no item element under the root
}

The second option is to wrap the entire code fragment in a try/catch block:

Document doc = otherComponent.getDocument( );
try {
    String price = doc.getRootElement( ).getChild("item")
                      .getChild("price")
                      .getTextTrim( );
} catch (NullPointerException e) {
    // Handle exceptional condition
}

While either approach works, I recommend the first. It allows finer-grained error handling, since it is possible to determine exactly which test failed and therefore determine exactly what problem occurred. The second code fragment informs you only that somewhere a problem occurred. In any case, careful testing of return values can save you some rather annoying NullPointerExceptions.

The Element class does have a handful of methods that deal with a common NullPointerException case: getting the text of a child element. These methods, named getChildText( ), getChildTextTrim( ), and getChildTextNormalize( ), return the text of a child element if the child element exists and null if the child element does not. Given the following XML document:

<books>
  <book>
    <name>Java &amp; XML</name>
    <pubDate>2006</pubDate>
  </book>
  <book>
    <name>Java In a Nutshell</name>
  </book>
</books>

The following code would produce a NullPointerException:

// books is the root <books> element of the document above
Iterator it = books.getChildren( ).iterator( );
while (it.hasNext( )) {
    Element book = (Element) it.next( );
    System.out.println(book.getChild("name").getText( ) +
        " was published in " + book.getChild("pubDate").getText( ));
}

But this code will not:

// books is the root <books> element of the document above
Iterator it = books.getChildren( ).iterator( );
while (it.hasNext( )) {
    Element book = (Element) it.next( );
    System.out.println(book.getChildText("name") +
        " was published in " + book.getChildText("pubDate"));
}

Instead of throwing a NullPointerException, it outputs:

Java & XML was published in 2006
Java In a Nutshell was published in null
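If a literal "null" in the output is not what you want, the chained lookup from the start of this section can also be written so that a missing element yields a default value. The following is only a sketch, assuming Java 8 or later for java.util.Optional. To keep it runnable without JDOM on the classpath, the nested Node class is a hypothetical stand-in for org.jdom.Element that mimics just the null-returning getChild( ) behavior; priceOf( ) is likewise an illustrative name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class SafeChildLookup {

    /** Hypothetical stand-in for org.jdom.Element: getChild() returns null when absent. */
    static class Node {
        private final Map<String, Node> children = new HashMap<>();
        private final String text;

        Node(String text) { this.text = text; }

        Node add(String name, Node child) { children.put(name, child); return this; }

        Node getChild(String name) { return children.get(name); }

        String getTextTrim() { return text.trim(); }
    }

    /** Walks root/item/price and returns the fallback if any step is missing. */
    static String priceOf(Node root, String fallback) {
        return Optional.ofNullable(root.getChild("item"))
                .map(item -> item.getChild("price"))   // a null result empties the Optional
                .map(Node::getTextTrim)
                .orElse(fallback);
    }

    public static void main(String[] args) {
        Node complete = new Node("")
                .add("item", new Node("").add("price", new Node(" 9.99 ")));
        Node missingPrice = new Node("").add("item", new Node(""));

        System.out.println(priceOf(complete, "unknown"));      // prints 9.99
        System.out.println(priceOf(missingPrice, "unknown"));  // prints unknown
    }
}
```

Like the try/catch approach, this does not tell you which element was missing, so explicit null checks remain the better choice when you need to report exactly what went wrong.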

Nodes Have Only One Parent

In the JDOM object model, a node object can have at most one parent. This parent could be a Document object, in the case of the root element, or an Element. When you add a child object to a parent object, the parent object (which could be a Document or Element object) checks if the child object already has a parent. If it does, an org.jdom.IllegalAddException is thrown. This is commonly seen when taking an Element from one document and adding it to another. To remove the relationship between the child object and its parent, you can either pass the child object to the parent's removeContent( ) method or call the child's detach( ) method.
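The parent check described above can be modeled in a few lines. This is a simplified sketch of the idea, not JDOM's source: Node is a hypothetical stand-in class, and it throws the JDK's IllegalStateException where JDOM would throw org.jdom.IllegalAddException. The helper method names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

public class SingleParentSketch {

    /** Hypothetical stand-in for a JDOM node; not JDOM's actual implementation. */
    static class Node {
        final String name;
        final List<Node> content = new ArrayList<>();
        Node parent;

        Node(String name) { this.name = name; }

        /** Refuses children that already belong to a parent. */
        Node addContent(Node child) {
            if (child.parent != null) {
                // JDOM throws org.jdom.IllegalAddException at this point
                throw new IllegalStateException(
                        child.name + " already has a parent: " + child.parent.name);
            }
            child.parent = this;
            content.add(child);
            return this;
        }

        /** Removes this node from its parent, if any, so it can be added elsewhere. */
        Node detach() {
            if (parent != null) {
                parent.content.remove(this);
                parent = null;
            }
            return this;
        }
    }

    /** Returns true if adding an attached child to a second parent is rejected. */
    static boolean moveWithoutDetachFails() {
        Node first = new Node("first");
        Node second = new Node("second");
        Node child = new Node("child");
        first.addContent(child);
        try {
            second.addContent(child);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    /** Returns true if the child ends up under the second parent after detaching. */
    static boolean moveAfterDetachSucceeds() {
        Node first = new Node("first");
        Node second = new Node("second");
        Node child = new Node("child");
        first.addContent(child);
        second.addContent(child.detach());
        return child.parent == second && first.content.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(moveWithoutDetachFails());   // prints true
        System.out.println(moveAfterDetachSucceeds());  // prints true
    }
}
```

The same pattern applies when moving an element between two JDOM documents: detach it from the first before adding it to the second.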

More on Subclassing

Since I covered factories and custom classes in this chapter, it is worth pointing out a few important things about subclassing that can be "gotcha" items. When you extend a class, and in particular the JDOM classes, ensure that your custom behavior is going to be activated as you want it. In other words, ensure that there is no path from an app through your subclass to the superclass that you aren't willing to live with. In almost every case, this involves ensuring that you override each constructor of the superclass. You'll notice that in Example 9-12, the ORAElement class, I overrode all four of the Element class's constructors. This ensured that any app using ORAElement would have to create the object with one of these constructors.

While that might seem like a trivial detail, imagine if I had left out the constructor that took in a name and URI for the element. That omission reduces the number of ways to construct the object by one, and it matters more than it first appears. Continuing with this hypothetical, you implement a CustomJDOMFactory class, like the one shown in Example 9-13, and override the various element( ) methods. However, you would probably forget to override element(String name, String uri), since you already forgot to override that constructor in your subclass.

Suddenly, you have a problem. Every time an element is requested by name and URI (which is quite often in the SAXBuilder process), you are going to get a plain, vanilla Element instance. However, the other element creation methods all return instances of ORAElement. Just like that, because of one lousy constructor, your document is going to have two element implementations, which is almost certainly not what you wanted. It is crucial to inspect every means of object creation in your subclasses, and generally make sure you override every constructor that is public in the superclass.
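The failure mode is easy to reproduce in miniature. In the sketch below, every class is a hypothetical stand-in written for this illustration (Element, CustomElement, DefaultFactory, and CustomFactory are not the JDOM classes or the chapter's examples); the point is only to show how one factory method that isn't overridden leads to mixed element types.

```java
public class FactoryGapSketch {

    /** Hypothetical stand-in for a base element class with two constructors. */
    static class Element {
        final String name;
        final String uri;

        Element(String name) { this(name, ""); }

        Element(String name, String uri) { this.name = name; this.uri = uri; }
    }

    /** Custom subclass that forgot the (name, uri) constructor. */
    static class CustomElement extends Element {
        CustomElement(String name) { super(name); }
    }

    /** Hypothetical stand-in for a default factory. */
    static class DefaultFactory {
        Element element(String name) { return new Element(name); }

        Element element(String name, String uri) { return new Element(name, uri); }
    }

    /** Overrides only one creation method; the other falls through to the superclass. */
    static class CustomFactory extends DefaultFactory {
        @Override
        Element element(String name) { return new CustomElement(name); }
    }

    /** Reports the runtime class of whatever the factory produced. */
    static String createdType(Element e) { return e.getClass().getSimpleName(); }

    public static void main(String[] args) {
        DefaultFactory factory = new CustomFactory();
        System.out.println(createdType(factory.element("title")));           // prints CustomElement
        System.out.println(createdType(factory.element("title", "urn:x")));  // prints Element
    }
}
```

A quick check like createdType( ) over each creation method of your factory is one inexpensive way to catch this before it shows up as a document with two element implementations.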

Creating Invalid XML

Another tricky problem to watch out for when subclassing is inadvertently creating invalid XML. Using JDOM, it's more or less impossible to create XML that is not well-formed, but consider the ORAElement subclass again. This subclass added the ora prefix to every element, which alone could cause it to fail validation. This is probably not a big deal, but you do need to comment out or remove the DOCTYPE declaration to avoid problems when reading the document back in. Even more important, you can get some unexpected results if you aren't careful. Look at this fragment of the XML generated using the ORAElement subclass, which only shows the last little bit of the serialized document:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE tutorial SYSTEM "DTD/JavaXML.dtd">
<!-- Java and XML Contents -->
<ora:book xmlns:ora="http://www.oracle.com">
 <ora:title ora:series="Java">Java and XML</ora:title>
 <!-- Other content -->
  <ora:copyright>
    <ora:copyright>
      <ora:year value="2001" />
      <ora:content>All Rights Reserved, Oracle &amp; Associates</ora:content>
    </ora:copyright>
  </ora:copyright>
</ora:book>

Notice that there are now two ora:copyright elements! What happened is that an existing element was in place in the Oracle namespace (the original ora:copyright element). However, the element nested within that, with no namespace, was also assigned the ora prefix and Oracle namespace through the ORAElement class. The result is two elements with the same name and namespace, but differing content models. This makes validation very tricky, and is probably not what you intended. These are simple examples, but in more complex documents with more complex subclasses, you will need to watch carefully what results you are generating, particularly with respect to a DTD, XML Schema, or other form of document constraints.