With the release of JAXP 1.3, a rich XPath API was added to JAXP. The API was designed to be object model neutral, meaning that, assuming the proper classes exist, your code can evaluate XPath expressions on XML objects created by any XML object model as well as return the API-appropriate types for nodes and sets of nodes. In addition to DOM, it is possible to obtain implementations of JAXP XPath interfaces that work with document objects created with JDOM, dom4j, and XOM, among others. (The JDOM and dom4j object models are discussed in Chapters and , respectively.) The standard JAXP distribution, however, only includes support for DOM Document objects.

Java Tip This section is not an exhaustive look at XPath. It specifically discusses the XPath API within JAXP. For more information on XPath, please check out XPath and XPointer by John E. Simpson (O'Reilly). As an additional caveat, some of the examples use expressions that are more verbose than necessary for illustrative purposes.

The core interface for the JAXP XPath API is javax.xml.xpath.XPath. This interface defines several methods named evaluate( ) for evaluating an XPath expression against an XML document that has already been parsed into a document object, or against an instance of org.xml.sax.InputSource if the document has not yet been parsed. The XPath interface also supports compiling an expression into an XPathExpression object. This functionality, similar to the Templates objects from TrAX, allows you to avoid the overhead of repeated compilation if you are going to use the same expression repeatedly. Unlike Templates objects, however, XPathExpression objects are documented in the JAXP API as neither thread-safe nor reentrant, so an instance shared between threads must be guarded by the application. The figure that follows contains a UML diagram of the JAXP XPath interfaces. I am also including NamespaceContext, which isn't strictly an XPath class (in fact, it's in the javax.xml.namespace package whereas the rest of these interfaces are in the javax.xml.xpath package), but I do discuss it in this section.

JAXP XPath interfaces

Evaluating an XPath expression returns one of five types of results. JAXP defines Java constants to represent these types; each object model implementation maps these constants to an actual Java type. The requested return type is passed as a javax.xml.namespace.QName object to the evaluate( ) method. If no return type is requested, the STRING type is used. The five return types are listed in Table 7-1, along with the constant from javax.xml.xpath.XPathConstants for each return type and the Java type that is mapped to that XPath type in the DOM implementation.

Table 7-1. XPath return types

Return type   QName constant            Java type (DOM implementation)
STRING        XPathConstants.STRING     java.lang.String
NUMBER        XPathConstants.NUMBER     java.lang.Double
BOOLEAN       XPathConstants.BOOLEAN    java.lang.Boolean
NODE          XPathConstants.NODE       org.w3c.dom.Node
NODESET       XPathConstants.NODESET    org.w3c.dom.NodeList
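
As a rough illustration of the table, the following sketch requests each of the five return types against a small inline document. The class name ReturnTypesSketch, its eval( ) helper, and the sample XML (including its seriesId value) are made up for this illustration and are not part of JAXP or of the examples in this section.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: the name, helper, and sample XML are illustrative only.
final class ReturnTypesSketch {

    static final String XML =
        "<schedule name='sample' seriesId='7'>"
        + "<show weekday='Monday'/><show weekday='Tuesday'/>"
        + "</schedule>";

    // Evaluates expr against the sample document, requesting the given return type.
    static Object eval(String expr, QName returnType) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        return xPath.evaluate(expr, doc, returnType);
    }

    public static void main(String[] args) throws Exception {
        // Each cast matches the Java type listed for the DOM implementation.
        String name = (String) eval("/schedule/@name", XPathConstants.STRING);
        Double id = (Double) eval("/schedule/@seriesId", XPathConstants.NUMBER);
        Boolean hasShows = (Boolean) eval("/schedule/show", XPathConstants.BOOLEAN);
        Element first = (Element) eval("/schedule/show", XPathConstants.NODE);
        NodeList all = (NodeList) eval("/schedule/show", XPathConstants.NODESET);

        System.out.println("STRING  -> " + name);
        System.out.println("NUMBER  -> " + id);
        System.out.println("BOOLEAN -> " + hasShows);
        System.out.println("NODE    -> weekday=" + first.getAttribute("weekday"));
        System.out.println("NODESET -> " + all.getLength() + " nodes");
    }
}
```

The BOOLEAN request here relies on the XPath rule that a non-empty node set converts to true, and the NODE request returns the first matching node in document order.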

Creating an XPath Instance

To obtain an instance of the XPath interface, you must first create an instance of javax.xml.xpath.XPathFactory through one of the static newInstance( ) methods of XPathFactory. To create an XPathFactory instance to work with and return DOM objects, call newInstance( ) without any arguments. To create an XPathFactory instance for a different object model, pass the URI assigned to that object model to newInstance( ). For example, to create an XPathFactory that uses JDOM, write:

XPathFactory factory = XPathFactory.newInstance("http://jdom.org/jaxp/xpath/jdom");

The newInstance( ) method will find the appropriate implementation of XPathFactory in a manner similar to the other JAXP factories. First, it will look for a system property named javax.xml.xpath.XPathFactory:uri, where uri is the URI passed to newInstance( ). If this property exists, it's assumed to be a class name, and the class with that name is instantiated and returned by newInstance( ). If no such system property exists, a property with the same name is searched for in lib/jaxp.properties and, if it exists, is assumed to be a class name. Finally, resources named META-INF/services/javax.xml.xpath.XPathFactory are searched for on the classpath. If no resources are found that support the URI, the default XPathFactory implementation is returned; because the default in the standard distribution supports only DOM, requesting any other object model at that point results in an XPathFactoryConfigurationException. Once you've obtained an implementation of XPathFactory, an XPath instance is created by a call to newXPath( ).
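
To make the lookup a little more concrete, here is a minimal sketch that probes whether a factory can be found for a given object model URI. The class FactoryLookupSketch, its isSupported( ) helper, and the urn:example URI are hypothetical names invented for this illustration; XPathFactory.DEFAULT_OBJECT_MODEL_URI is the constant JAXP defines for DOM.

```java
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFactoryConfigurationException;

// Hypothetical demo class: probes the factory lookup for an object model URI.
final class FactoryLookupSketch {

    // Returns true if newInstance(uri) can locate a factory for the object model.
    static boolean isSupported(String objectModelUri) {
        try {
            XPathFactory.newInstance(objectModelUri);
            return true;
        } catch (XPathFactoryConfigurationException e) {
            // No system property, jaxp.properties entry, service resource,
            // or platform default matched this URI.
            return false;
        }
    }

    public static void main(String[] args) {
        // The DOM object model, equivalent to calling newInstance() with no arguments.
        System.out.println(isSupported(XPathFactory.DEFAULT_OBJECT_MODEL_URI));
        // A made-up URI that no installed factory is expected to recognize.
        System.out.println(isSupported("urn:example:made-up-object-model"));
    }
}
```

A check like this can be handy at startup when an application depends on a third-party object model being on the classpath.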

XPath Examples

I'll be using the following XML document to demonstrate the basic capabilities of the JAXP XPath API. To simplify the examples, this document is assumed to be in a file named tds.xml.

Example Document for XPath examples

<?xml version="1.0" encoding="UTF-8"?>
<schedule name="the daily show with jon stewart" series >
 <show date="06.12.06" weekday="Monday" dayNumber="12">
 <guest>
 <name>Thomas Friedman</name>
 <credit>Author of "The World is Flat"</credit>
 </guest>
 </show>
 <show date="06.13.06" weekday="Tuesday" dayNumber="13">
 <guest>
 <name>Ken Mehlman</name>
 <credit>Chair of the Republican National Committee</credit>
 </guest>
 </show>
 <show date="06.14.06" weekday="Wednesday" dayNumber="14">
 <guest>
 <name>Tim Russert</name>
 <credit>Host of NBC's "Meet the Press"</credit>
 </guest>
 </show>
 <show date="06.15.06" weekday="Thursday" dayNumber="15">
 <guest>
 <name>Louis C.K.</name>
 <credit>Star of HBO series "Lucky Louie"</credit>
 </guest>
 </show>
</schedule>

To start, let's write an XPath expression that gets the name attribute of the root schedule element. This can be written as:

/schedule/@name

Wrapping this expression in some Java code is as simple as creating an XPath instance and calling the evaluate( ) method:

package javaxml3;
import java.io.FileReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;
public class GetName {
 public static void main(String[] args) throws Exception {
 XPathFactory factory = XPathFactory.newInstance( );
 XPath xPath = factory.newXPath( );
 String result = xPath.evaluate("/schedule/@name", new InputSource(
 new FileReader("tds.xml")));
 System.out.println(result);
 }
}

Running this code outputs the value of the name attribute, "the daily show with jon stewart", to the console. Note that because we aren't specifying a result type, the XPath expression returns the value of the attribute as a String. If, instead, we wanted to get a DOM Attr object, we could pass XPathConstants.NODE to evaluate( ):

 Attr result = (Attr) xPath.evaluate("/schedule/@name", new InputSource(
 new FileReader("tds.xml")), XPathConstants.NODE);
 System.out.println(result.getValue( ));

This would enable us to call the methods on the Attr interface to discover information about the attribute other than just the value, such as whether the value of the attribute was specified in the document or is the default from a DTD. We can also use the NUMBER return type to have JAXP do any numeric conversion for us:

 Double result = (Double) xPath.evaluate("/schedule/@seriesId",
 new InputSource(new FileReader("tds.xml")),
 XPathConstants.NUMBER);
 System.out.println(result.intValue( ));

For the result types STRING, NUMBER, BOOLEAN, and NODE, if there are multiple nodes that match the expression, only the first result is returned. When you want to get a list of nodes, you need to use the NODESET return type. In the DOM implementation, this returns an org.w3c.dom.NodeList object, which can then be looped over. To start, let's get a NodeList containing all of the show elements and output the node count to the console:

 NodeList shows = (NodeList) xPath.evaluate("/schedule/show",
 new InputSource(new FileReader("tds.xml")),
 XPathConstants.NODESET);
 System.out.println("Document has " + shows.getLength( ) + " shows.");

Then we can iterate over the NodeList with a simple for loop. In this case, each of the Node objects within the NodeList is an Element. Since these are DOM objects, we can evaluate other XPath expressions against them, using each show element as the context node. Putting this to use to output a listing of various element and attribute values looks like:

 for (int i = 0; i < shows.getLength( ); i++) {
 Element show = (Element) shows.item(i);
 String guestName = xPath.evaluate("guest/name/text( )", show);
 String guestCredit = xPath.evaluate("guest/credit/text( )", show);
 System.out.println(show.getAttribute("weekday") + ", "
 + show.getAttribute("date") + " - " + guestName + " ("
 + guestCredit + ")");
 }
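
The loop above hands the same two expression strings to evaluate( ) on every iteration. As a sketch of the compile-once alternative mentioned at the start of this section, the following version compiles each expression into an XPathExpression before the loop. The class CompiledExpressionSketch and its listGuests( ) helper are hypothetical, and the document is passed in as a string only to keep the sketch self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: compiles expressions once and reuses them in a loop.
final class CompiledExpressionSketch {

    // Returns one "name (credit)" line per show element in the given XML.
    static List<String> listGuests(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();

        // Compile each expression a single time, outside the loop.
        XPathExpression showsExpr = xPath.compile("/schedule/show");
        XPathExpression nameExpr = xPath.compile("guest/name");
        XPathExpression creditExpr = xPath.compile("guest/credit");

        NodeList shows = (NodeList) showsExpr.evaluate(doc, XPathConstants.NODESET);
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < shows.getLength(); i++) {
            Node show = shows.item(i);
            // The relative expressions are evaluated with this show as context;
            // evaluate(Object) returns the string value of the first match.
            lines.add(nameExpr.evaluate(show) + " (" + creditExpr.evaluate(show) + ")");
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<schedule>"
            + "<show><guest><name>A</name><credit>B</credit></guest></show>"
            + "<show><guest><name>C</name><credit>D</credit></guest></show>"
            + "</schedule>";
        for (String line : listGuests(xml)) {
            System.out.println(line);
        }
    }
}
```

For a four-element document the saving is negligible; the pattern matters more when the same expressions run against many nodes or many documents.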

Namespaces in XPath

XPath allows you to reference elements and attributes that are assigned to a namespace through the same prefix notation as you would within an XML document. If we rewrite the example XML document such that the schedule and show elements are in a namespace with the URI uri:comedy:schedule and the guest, name, and credit elements are in a namespace with the URI uri:comedy:guest, it could look like this:

<?xml version="1.0" encoding="UTF-8"?>
<schedule name="the daily show with jon stewart" series 
 xmlns="uri:comedy:schedule" xmlns:g="uri:comedy:guest">
 <show date="06.12.06" weekday="Monday" dayNumber="12">
 <g:guest>
 <g:name>Thomas Friedman</g:name>
 <g:credit>Author of "The World is Flat"</g:credit>
 </g:guest>
 </show>
 <show date="06.13.06" weekday="Tuesday" dayNumber="13">
 <g:guest>
 <g:name>Ken Mehlman</g:name>
 <g:credit>Chair of the Republican National Committee</g:credit>
 </g:guest>
 </show>
 <show date="06.14.06" weekday="Wednesday" dayNumber="14">
 <g:guest>
 <g:name>Tim Russert</g:name>
 <g:credit>Host of NBC's "Meet the Press"</g:credit>
 </g:guest>
 </show>
 <show date="06.15.06" weekday="Thursday" dayNumber="15">
 <g:guest>
 <g:name>Louis C.K.</g:name>
 <g:credit>Star of HBO series "Lucky Louie"</g:credit>
 </g:guest>
 </show>
</schedule>

To create an XPath expression that evaluates to all the guest elements, you might think you could write an expression such as:

/schedule/show/g:guest

But this won't work for two reasons. First, although in the XML document the schedule and show elements are in the default namespace, XPath doesn't support a default namespace. The second reason this won't work is that we haven't associated the g prefix with uri:comedy:guest; namespace declarations in an XML document do not apply to XPath expressions evaluated against it. There are two solutions for this. The first, and less preferable, is to test the namespace URI and local name of every element explicitly, such as:

/*[namespace-uri( )='uri:comedy:schedule' and local-name( )='schedule']/*[namespace-uri( )='uri:comedy:schedule' and local-name( )='show']/*[namespace-uri( )='uri:comedy:guest' and local-name( )='guest']

Although this leads to ugly expressions, it can be useful if used sparingly. The second solution is to provide an implementation of the javax.xml.namespace.NamespaceContext interface. This interface defines three methods for mapping a single prefix to a URI, a URI to a single prefix, and a URI to multiple prefixes. The following example contains an implementation of NamespaceContext backed by two HashMaps.

Example SimpleNamespaceContext

package javaxml3;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
public class SimpleNamespaceContext implements NamespaceContext {
 private Map urisByPrefix = new HashMap( );
 private Map prefixesByURI = new HashMap( );
 public SimpleNamespaceContext( ) {
 // prepopulate with xml and xmlns prefixes
 // per JavaDoc of NamespaceContext interface
 addNamespace(XMLConstants.XML_NS_PREFIX, XMLConstants.XML_NS_URI);
 addNamespace(XMLConstants.XMLNS_ATTRIBUTE,
 XMLConstants.XMLNS_ATTRIBUTE_NS_URI);
 }
 public synchronized void addNamespace(String prefix, String namespaceURI) {
 urisByPrefix.put(prefix, namespaceURI);
 if (prefixesByURI.containsKey(namespaceURI)) {
 ((Set) prefixesByURI.get(namespaceURI)).add(prefix);
 } else {
 Set set = new HashSet( );
 set.add(prefix);
 prefixesByURI.put(namespaceURI, set);
 }
 }
 public String getNamespaceURI(String prefix) {
 if (prefix == null)
 throw new IllegalArgumentException("prefix cannot be null");
 if (urisByPrefix.containsKey(prefix))
 return (String) urisByPrefix.get(prefix);
 else
 return XMLConstants.NULL_NS_URI;
 }
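 public String getPrefix(String namespaceURI) {
 // per the NamespaceContext contract, return null for an unbound URI
 Iterator prefixes = getPrefixes(namespaceURI);
 return prefixes.hasNext( ) ? (String) prefixes.next( ) : null;
 }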
 public Iterator getPrefixes(String namespaceURI) {
 if (namespaceURI == null)
 throw new IllegalArgumentException("namespaceURI cannot be null");
 if (prefixesByURI.containsKey(namespaceURI)) {
 return ((Set) prefixesByURI.get(namespaceURI)).iterator( );
 } else {
 return Collections.EMPTY_SET.iterator( );
 }
 }
}

We can put this together with the code from the last section to produce:

XPathFactory factory = XPathFactory.newInstance( );
XPath xPath = factory.newXPath( );
SimpleNamespaceContext nsContext = new SimpleNamespaceContext( );
xPath.setNamespaceContext(nsContext);
nsContext.addNamespace("s", "uri:comedy:schedule");
nsContext.addNamespace("g", "uri:comedy:guest");
NodeList shows = (NodeList) xPath.evaluate("/s:schedule/s:show",
 new InputSource(new FileReader("tds_ns.xml")),
 XPathConstants.NODESET);
System.out.println("Document has " + shows.getLength( ) + " shows.");
for (int i = 0; i < shows.getLength( ); i++) {
 Element show = (Element) shows.item(i);
 String guestName = xPath.evaluate("g:guest/g:name/text( )", show);
 String guestCredit = xPath.evaluate("g:guest/g:credit/text( )", show);
 System.out.println(show.getAttribute("weekday") + ", "
 + show.getAttribute("date") + " - " + guestName + " ("
 + guestCredit + ")");
}
Java Tip

Note that because the namespace prefixes used in the XPath expressions are not connected to the prefixes in the actual document, we didn't have to use g as the prefix for uri:comedy:guest. We could have used anything (other than s).
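
The point about prefixes can be checked with a small experiment. The sketch below counts matches for an expression against a namespaced document that has been parsed into a DOM tree first; in that case the DocumentBuilderFactory has to be made namespace-aware, which it is not by default. The class NamespacePrefixSketch, its count( ) helper, the inline anonymous NamespaceContext, and the prefixes x and y are all assumptions made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: prefixes in an expression are independent of the document's.
final class NamespacePrefixSketch {

    static final String XML =
        "<schedule xmlns='uri:comedy:schedule' xmlns:g='uri:comedy:guest'>"
        + "<show><g:guest><g:name>A</g:name></g:guest></show>"
        + "<show><g:guest><g:name>B</g:name></g:guest></show>"
        + "</schedule>";

    // Counts the nodes matching expr, resolving prefixes through the given map.
    static int count(String expr, final Map<String, String> urisByPrefix) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // required for namespace matching on a DOM tree
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));

        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                if (prefix == null) {
                    throw new IllegalArgumentException("prefix cannot be null");
                }
                String uri = urisByPrefix.get(prefix);
                return uri != null ? uri : XMLConstants.NULL_NS_URI;
            }
            @Override
            public String getPrefix(String namespaceURI) {
                Iterator<String> it = getPrefixes(namespaceURI);
                return it.hasNext() ? it.next() : null;
            }
            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                List<String> prefixes = new ArrayList<>();
                for (Map.Entry<String, String> e : urisByPrefix.entrySet()) {
                    if (e.getValue().equals(namespaceURI)) {
                        prefixes.add(e.getKey());
                    }
                }
                return prefixes.iterator();
            }
        });
        NodeList nodes = (NodeList) xPath.evaluate(expr, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> prefixes = new HashMap<>();
        prefixes.put("x", "uri:comedy:schedule"); // the document uses the default namespace
        prefixes.put("y", "uri:comedy:guest");    // the document uses the prefix g

        System.out.println(count("/x:schedule/x:show/y:guest", prefixes));
        // Unprefixed names only match elements that are in no namespace.
        System.out.println(count("/schedule/show", prefixes));
        // Testing the URI directly needs no prefix mapping at all.
        System.out.println(count(
            "//*[local-name()='guest' and namespace-uri()='uri:comedy:guest']", prefixes));
    }
}
```

When an expression is evaluated against an InputSource instead, as in the listing above, the XPath implementation does the parsing itself and this factory setting does not come into play.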


XPath Variables

XPath expressions can contain variables that are interpolated when the expression is evaluated. These variables are indicated by the use of the dollar sign ($) character. XPath variables can be useful in a variety of ways; for Java developers, perhaps most so when compiling an XPath expression into an XPathExpression object for repeat use. For example, an expression that would find the appropriate show element for a date might look like:

/schedule/show[@date=$date]/guest

However, passing this expression to the evaluate( ) method, as in the following example, throws a NullPointerException, because if the XPathExpression or XPath object sees a variable reference, it expects to have an implementation of XPathVariableResolver defined.

Example XPathExpression with variable

XPathFactory factory = XPathFactory.newInstance( );
XPath xPath = factory.newXPath( );
XPathExpression exp = xPath.compile("/schedule/show[@date=$date]/guest");
// this next line throws a NullPointerException
Element element = (Element) exp.evaluate(inputSource, XPathConstants.NODE);

Implementations of XPathVariableResolver resolve variable values based on a javax.xml.namespace.QName object. The resolved variable can be any type, but should really be only one of the XPath return types. Since we only care about the one variable named $date, we can implement a resolver that always returns the same object regardless of the QName value:

class StaticVariableResolver implements XPathVariableResolver {
 private Object value = null;
 StaticVariableResolver(Object value) {
 this.value = value;
 }
 public Object resolveVariable(QName name) {
 return value;
 }
}

Then we create an instance of this class and pass it to our newly created XPath object:

XPathFactory factory = XPathFactory.newInstance( );
XPath xPath = factory.newXPath( );
xPath.setXPathVariableResolver(new StaticVariableResolver("06.12.06"));
XPathExpression exp = xPath.compile("/schedule/show[@date=$date]/guest");
// with a resolver in place, this line no longer throws a NullPointerException
Element element = (Element) exp.evaluate(inputSource, XPathConstants.NODE);

This is obviously of limited usefulness as we had to set the value of the variable in advance of the expression compilation, let alone the evaluation. To make a more useful implementation of XPathVariableResolver, we can have the implementation backed by a Map with QName objects as the keys:

package javaxml3;
import java.util.HashMap;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathVariableResolver;
public class MapVariableResolver implements XPathVariableResolver {
 private HashMap variables = new HashMap( );
 public void addVariable(String namespaceURI, String localName, Object value) {
 addVariable(new QName(namespaceURI, localName), value);
 }
 public void addVariable(QName name, Object value) {
 variables.put(name, value);
 }
 public Object resolveVariable(QName name) {
 Object retval = variables.get(name);
 return retval;
 }
}

This resolver allows us to set the value of the variable after the expression has been compiled, leading us to the full GuestManager class:

package javaxml3;
import java.io.File;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;
public class GuestManager {
 private Document document;
 private XPathExpression expression;
 private MapVariableResolver resolver = new MapVariableResolver( );
 private SimpleDateFormat xmlDateFormat = new SimpleDateFormat("MM.dd.yy");
 public GuestManager(String fileName) throws ParserConfigurationException,
 SAXException, IOException, XPathExpressionException {
 DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance( );
 DocumentBuilder builder = dbf.newDocumentBuilder( );
 document = builder.parse(new File(fileName));
 XPathFactory factory = XPathFactory.newInstance( );
 XPath xPath = factory.newXPath( );
 xPath.setXPathVariableResolver(resolver);
 expression = xPath.compile("/schedule/show[@date=$date]/guest");
 }
 public synchronized Element getGuest(Date guestDate)
 throws XPathExpressionException {
 String formattedDate = xmlDateFormat.format(guestDate);
 resolver.addVariable(null, "date", formattedDate);
 return (Element) expression.evaluate(document, XPathConstants.NODE);
 }
 public static void main(String[] args) throws Exception {
 GuestManager gm = new GuestManager("tds.xml");
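 // the deprecated Date(int, int, int) constructor counts years from 1900,
 // so build June 14, 2006 with a calendar instead
 Element guest = gm.getGuest(new java.util.GregorianCalendar(2006,
 java.util.Calendar.JUNE, 14).getTime( ));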
 System.out.println(guest.getElementsByTagName("name").item(0)
 .getTextContent( ));
 }
}
Java Warning

Notice how the getGuest( ) method is synchronized? That's because our use of XPathVariableResolver is not thread-safe: the resolver is shared state that every call modifies before evaluating the expression. If there were multiple threads calling getGuest( ), one thread could execute the addVariable( ) call during the interval between a second thread's calls to addVariable( ) and evaluate( ), and the second thread would then be given the guest for the wrong date. Instead of synchronizing the whole method, we could have synchronized just these two lines, locking on the resolver object.
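
The narrower lock mentioned in the warning might look something like the following sketch, which guards only the variable update and the evaluation. DateLookupSketch and guestNameOn( ) are hypothetical names, the resolver is a plain Map exposed through a method reference rather than the MapVariableResolver class, and the XML is passed in as a string to keep the sketch self-contained.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: locks on the variable map instead of the whole method.
final class DateLookupSketch {

    private static final QName DATE = new QName("date");

    private final Map<QName, Object> variables = new HashMap<>();
    private final Document document;
    private final XPathExpression expression;

    DateLookupSketch(String xml) throws Exception {
        document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        // Register the resolver before compiling, as GuestManager does;
        // the method reference looks each variable name up in the map.
        xPath.setXPathVariableResolver(variables::get);
        expression = xPath.compile("/schedule/show[@date=$date]/guest/name");
    }

    // Returns the guest name for the date, or an empty string if no show matches.
    String guestNameOn(String date) throws XPathExpressionException {
        // Only the two dependent steps are guarded: no other thread can change
        // $date between the put and the evaluation.
        synchronized (variables) {
            variables.put(DATE, date);
            return expression.evaluate(document);
        }
    }

    public static void main(String[] args) throws Exception {
        DateLookupSketch lookup = new DateLookupSketch("<schedule>"
            + "<show date='06.12.06'><guest><name>A</name></guest></show>"
            + "<show date='06.13.06'><guest><name>B</name></guest></show>"
            + "</schedule>");
        System.out.println(lookup.guestNameOn("06.13.06"));
    }
}
```

In a class this small the narrower lock behaves the same as a synchronized method; the difference only shows once the method does other work that does not touch the resolver.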


XPath Functions

The XPath specification defines a handful of built-in functions that allow you to access a variety of information about the result of an expression. We've already seen something that looks like one: text( ), which selects the text nodes contained within an element (strictly speaking, it is a node test rather than a function). A commonly seen function is count( ), which counts the number of nodes within a node set. For example, this bit of Java code from earlier in this section:

 NodeList shows = (NodeList) xPath.evaluate("/schedule/show",
 new InputSource(new FileReader("tds.xml")),
 XPathConstants.NODESET);
 System.out.println("Document has " + shows.getLength( ) + " shows.");

Could be rewritten as:

 Number shows = (Number) xPath.evaluate("count(/schedule/show)",
 new InputSource(new FileReader("tds.xml")),
 XPathConstants.NUMBER);
 System.out.println("Document has " + shows.intValue( ) + " shows.");
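
A few of the other built-in functions can be tried the same way. The sketch below is only an illustration: the class BuiltInFunctionsSketch, its three helpers, and the trimmed-down inline document are invented here, while count( ), sum( ), contains( ), substring-before( ), and string-length( ) are functions from the XPath 1.0 core library.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class: exercises a few XPath 1.0 core functions.
final class BuiltInFunctionsSketch {

    static final String XML = "<schedule>"
        + "<show dayNumber='12'><guest><name>Thomas Friedman</name></guest></show>"
        + "<show dayNumber='13'><guest><name>Ken Mehlman</name></guest></show>"
        + "</schedule>";

    private static Object eval(String expr, javax.xml.namespace.QName type)
            throws XPathExpressionException {
        XPath xPath = XPathFactory.newInstance().newXPath();
        // A fresh InputSource is needed for each evaluation because the
        // underlying reader is consumed when the document is parsed.
        return xPath.evaluate(expr, new InputSource(new StringReader(XML)), type);
    }

    static double number(String expr) throws XPathExpressionException {
        return (Double) eval(expr, XPathConstants.NUMBER);
    }

    static boolean bool(String expr) throws XPathExpressionException {
        return (Boolean) eval(expr, XPathConstants.BOOLEAN);
    }

    static String string(String expr) throws XPathExpressionException {
        return (String) eval(expr, XPathConstants.STRING);
    }

    public static void main(String[] args) throws XPathExpressionException {
        System.out.println(number("count(/schedule/show)"));
        System.out.println(number("sum(/schedule/show/@dayNumber)"));
        System.out.println(bool("contains(/schedule/show[1]/guest/name, 'Friedman')"));
        System.out.println(string("substring-before(/schedule/show[2]/guest/name, ' ')"));
        System.out.println(number("string-length(/schedule/show[2]/guest/name)"));
    }
}
```

Choosing the return type to match what the function produces (NUMBER for count( ) and sum( ), BOOLEAN for contains( )) avoids an extra conversion from the default STRING result.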

In addition to the built-in functions, XPath allows for custom functions, and the JAXP XPath API provides interfaces for creating your own custom functions. A custom function is encapsulated in a class that implements the javax.xml.xpath.XPathFunction interface. This interface defines a single method, evaluate( ), which takes a List of arguments and returns an Object. If there are no arguments to a function, the arguments list may be null. You can't use XPathFunction to override any of the built-in XPath functions. Even if you try, the API will ignore you. In addition to creating the function class, you must also create an implementation of the XPathFunctionResolver interface. This interface, which is closely related to the XPathVariableResolver interface, defines a method named resolveFunction( ) that accepts a qualified name and an arity value and returns the function object. As with XPathVariableResolver and NamespaceContext before it, the implementation of XPathFunctionResolver is set for an XPath object by calling setXPathFunctionResolver( ).

What's Arity?

Arity is a fancy word for the number of arguments a function (or method, if you're talking about Java) accepts. This value can be useful if you want to have separate function classes for two custom functions with the same name that accept a different number of parameters. Even if you test for the appropriate arity value in the resolver, you should still check the argument list length in your function class's evaluate( ) method and throw an exception if the incorrect number of arguments has been passed.
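
As a sketch of arity-based dispatch, the resolver below returns a different function object for the one-argument and two-argument forms of the same function name. Everything named here is hypothetical: the class ArityDispatchSketch, the namespace urn:example:fn, the prefix ex, and the shout function itself. The sketch also assumes that the DOM implementation hands XPath numbers to the function as Double objects, in line with the return type mapping in Table 7-1.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFunctionException;
import javax.xml.xpath.XPathFunctionResolver;
import org.xml.sax.InputSource;

// Hypothetical demo class: one function name, two implementations chosen by arity.
final class ArityDispatchSketch {

    static final String NS = "urn:example:fn"; // made-up namespace for the function

    // Defensive check inside each function, as the sidebar recommends.
    static void requireArgs(List<?> args, int expected) throws XPathFunctionException {
        if (args == null || args.size() != expected) {
            throw new XPathFunctionException("expected " + expected + " argument(s)");
        }
    }

    static final XPathFunctionResolver RESOLVER = (name, arity) -> {
        if (!NS.equals(name.getNamespaceURI()) || !"shout".equals(name.getLocalPart())) {
            return null; // not a function this resolver knows about
        }
        if (arity == 1) {
            // ex:shout(text): upper-cases the text
            return args -> {
                requireArgs(args, 1);
                return String.valueOf(args.get(0)).toUpperCase(Locale.ROOT);
            };
        }
        if (arity == 2) {
            // ex:shout(text, n): upper-cases the text and appends n exclamation marks
            return args -> {
                requireArgs(args, 2);
                int marks = ((Number) args.get(1)).intValue(); // assumes a Double argument
                StringBuilder sb =
                    new StringBuilder(String.valueOf(args.get(0)).toUpperCase(Locale.ROOT));
                for (int i = 0; i < marks; i++) {
                    sb.append('!');
                }
                return sb.toString();
            };
        }
        return null; // no implementation for any other arity
    };

    // Evaluates expr with the ex prefix bound to NS and the resolver installed.
    static String call(String expr) throws XPathExpressionException {
        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "ex".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
            }
            @Override
            public String getPrefix(String uri) {
                return NS.equals(uri) ? "ex" : null;
            }
            @Override
            public Iterator<String> getPrefixes(String uri) {
                return NS.equals(uri)
                    ? Collections.singletonList("ex").iterator()
                    : Collections.<String>emptyIterator();
            }
        });
        xPath.setXPathFunctionResolver(RESOLVER);
        // The functions ignore the document, so a trivial one serves as context.
        return xPath.evaluate(expr, new InputSource(new StringReader("<root/>")));
    }

    public static void main(String[] args) throws XPathExpressionException {
        System.out.println(call("ex:shout('hi')"));
        System.out.println(call("ex:shout('hi', 3)"));
    }
}
```

Lambdas are used for the function objects purely for brevity; separate classes implementing XPathFunction, one per arity, would work the same way.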

Every reference to a custom function must be qualified with a namespace URI or a mapped prefix. The following example contains implementations of XPathFunction and XPathFunctionResolver, as well as the use of an XPath expression that calls this custom function.

Example XPathFunction in use

package javaxml3;
import java.io.FileReader;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFunction;
import javax.xml.xpath.XPathFunctionException;
import javax.xml.xpath.XPathFunctionResolver;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
class SampleFunction implements XPathFunction {
 public Object evaluate(List args) throws XPathFunctionException {
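 // the argument list may be null when the function is called with no arguments
 if (args == null || args.size( ) != 1)
 throw new XPathFunctionException("I need exactly one argument");
 // the single argument is a node set containing a guest element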
 NodeList guestNodes = (NodeList) args.get(0);
 Element guest = (Element) guestNodes.item(0);
 NodeList nameNodes = guest.getElementsByTagNameNS("uri:comedy:guest",
 "name");
 NodeList creditNodes = guest.getElementsByTagNameNS("uri:comedy:guest",
 "credit");
 return evaluate(nameNodes, creditNodes);
 }
 private String evaluate(NodeList nameNodes, NodeList creditNodes) {
 return "I hope " + nameNodes.item(0).getTextContent( )
 + " makes a good joke about being "
 + creditNodes.item(0).getTextContent( );
 }
}
class SampleFunctionResolver implements XPathFunctionResolver {
 public XPathFunction resolveFunction(QName functionName, int arity) {
 if ("uri:comedy:guest".equals(functionName.getNamespaceURI( ))
 && "joke".equals(functionName.getLocalPart( )) && (arity == 1)) {
 return new SampleFunction( );
 } else
 return null;
 }
}
public class FunctionExample {
 public static void main(String[] args) throws Exception {
 XPathFactory factory = XPathFactory.newInstance( );
 XPath xPath = factory.newXPath( );
 SimpleNamespaceContext nsContext = new SimpleNamespaceContext( );
 xPath.setNamespaceContext(nsContext);
 nsContext.addNamespace("s", "uri:comedy:schedule");
 nsContext.addNamespace("g", "uri:comedy:guest");
 xPath.setXPathFunctionResolver(new SampleFunctionResolver( ));
 NodeList shows = (NodeList) xPath.evaluate("/s:schedule/s:show",
 new InputSource(new FileReader("tds_ns.xml")),
 XPathConstants.NODESET);
 for (int i = 0; i < shows.getLength( ); i++) {
 Element show = (Element) shows.item(i);
 String guestJoke = xPath.evaluate("g:joke(g:guest)", show);
 System.out
 .println(show.getAttribute("weekday") + " - " + guestJoke);
 }
 }
}