Transformations
dom4j has ample support for XML transformations. dom4j objects can be used either with XSL transformations through JAXP or with dom4j's own rule-based transformation classes. In both cases, you encapsulate the transformation logic once and then apply it to multiple documents.
TrAX
Document objects created with dom4j can be used as the source or the result of transformations performed by TrAX, the transformation API that is part of the JAXP specification. This is done with the classes org.dom4j.io.DocumentSource and org.dom4j.io.DocumentResult, which implement the javax.xml.transform.Source and javax.xml.transform.Result interfaces, respectively. DocumentSource and DocumentResult can be used together, where the input and output of a transformation are both dom4j Document objects, or independently, for example with a dom4j Document as the input and a String as the output. Example 10-3 contains sample code that transforms the contents of an XML file into a dom4j Document object.
Example 10-3. Transformation from a file to an org.dom4j.Document object using TrAX
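import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;
import org.dom4j.Document;
import org.dom4j.io.DocumentResult;

// compile the stylesheet into a reusable Transformer
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer =
    factory.newTransformer(new StreamSource("stylesheet.xsl"));

// read the input from a file and collect the output as a dom4j Document
StreamSource in = new StreamSource("input.xml");
DocumentResult out = new DocumentResult();
transformer.transform(in, out);
Document resultDocument = out.getDocument();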
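The reverse direction mentioned above, a dom4j Document as the input and a String as the output, can be sketched as follows. This is a minimal sketch that assumes dom4j is on the classpath; the class name DocumentToStringDemo and the sample XML are illustrative and not part of the original example. It relies on the fact that a Transformer created without a stylesheet performs an identity transformation.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.io.DocumentSource;

// Hypothetical demo class: dom4j Document in, String out
class DocumentToStringDemo {

    static String serialize(Document doc) throws Exception {
        // newTransformer() with no stylesheet yields an identity transformation
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        // DocumentSource adapts the dom4j Document to the TrAX Source interface
        StringWriter out = new StringWriter();
        transformer.transform(new DocumentSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // sample input invented for this sketch
        Document doc = DocumentHelper.parseText(
            "<books><book><title>Sample</title></book></books>");
        System.out.println(serialize(doc));
    }
}
```

Passing a stylesheet Source to newTransformer( ), as in Example 10-3, would apply a real transformation instead of a plain copy.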
Rule-Based Transformations
dom4j includes an API for defining a transformation entirely in Java. These transformations are written as a series of org.dom4j.rule.Rule objects contained within an org.dom4j.rule.Stylesheet object. A Rule object is composed of an implementation of the org.dom4j.rule.Pattern interface, which governs which nodes the Rule applies to, and an implementation of the org.dom4j.rule.Action interface, which performs some action upon the matched nodes.
The two implementations of the Pattern interface included with the dom4j distribution are org.dom4j.rule.pattern.NodeTypePattern and org.dom4j.xpath.XPathPattern. These implementations do what their names imply: NodeTypePattern matches nodes based on type, and XPathPattern matches nodes based on an XPath expression. XPathPattern instances are created with DocumentFactory's createPattern( ) method. The Action interface defines a single method called run( ), which accepts any Node object as a parameter. Implementations are free to modify the node passed to their run( ) method.
The easiest way to demonstrate the rule API is to compare a transformation written in XSLT with the same transformation written with the rule API. The XSL stylesheet in Example 10-4 takes a child element named pubDate and makes it an attribute. This could be run against a document structured like Example 10-1.
Example 10-4. Sample XSL stylesheet
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="books">
    <books>
      <xsl:apply-templates/>
    </books>
  </xsl:template>
  <xsl:template match="book">
    <book>
      <xsl:attribute name="pubDate">
        <xsl:value-of select="pubDate"/>
      </xsl:attribute>
      <xsl:value-of select="title"/>
    </book>
  </xsl:template>
</xsl:stylesheet>
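To see what this stylesheet produces, it can be applied with plain JAXP to a small inline document. The following is a minimal sketch rather than part of the original example: the class name PubDateStylesheetDemo and the sample input are hypothetical, the input merely assumes a books/book/title/pubDate structure, and the stylesheet is inlined with the attribute explicitly named pubDate so that the sketch is self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: runs the pubDate stylesheet on an inline document
class PubDateStylesheetDemo {

    // Equivalent of the stylesheet above, inlined as a string
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:template match=\"books\">"
        + "<books><xsl:apply-templates/></books>"
        + "</xsl:template>"
        + "<xsl:template match=\"book\">"
        + "<book>"
        + "<xsl:attribute name=\"pubDate\">"
        + "<xsl:value-of select=\"pubDate\"/>"
        + "</xsl:attribute>"
        + "<xsl:value-of select=\"title\"/>"
        + "</book>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Sample input invented for this sketch (assumed structure)
    static final String SAMPLE_INPUT =
        "<books><book><title>Sample Title</title>"
        + "<pubDate>2001</pubDate></book></books>";

    static String transform(String xml) throws TransformerException {
        // compile the stylesheet
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        // transform the input string and capture the result as a string
        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(SAMPLE_INPUT));
    }
}
```

In the result, each book element carries a pubDate attribute and contains only the title text; the pubDate child element is gone.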
This same transformation could be written with the rule API as:
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.rule.Action;
import org.dom4j.rule.Rule;
import org.dom4j.rule.Stylesheet;

public class RuleExample {

    // named inner class: copies a book and turns pubDate into an attribute
    class BookAction implements Action {
        public void run(Node node) throws Exception {
            if (node instanceof Element) {
                Element element = (Element) node;
                Element newElement = element.createCopy();

                // make pubDate an attribute
                Element pubDateElement = newElement.element("pubDate");
                if (pubDateElement != null) {
                    // remove the pubDate element from the copy
                    newElement.remove(pubDateElement);
                    Attribute attr = DocumentHelper.createAttribute(
                        newElement, "pubDate", pubDateElement.getTextTrim());
                    newElement.add(attr);
                }

                // add our new element to the result document's root element
                rootElement.add(newElement);
            }
        }
    }

    private Element rootElement;

    public Document transform(Document input) throws DocumentException {
        // must be final because we're using it in an inner class
        final Document result = DocumentHelper.createDocument();
        final Stylesheet style = new Stylesheet();

        // anonymous inner class: creates the result root, then processes children
        Rule booksRule = new Rule(DocumentHelper.createPattern("books"),
            new Action() {
                public void run(Node node) throws Exception {
                    rootElement = result.addElement("books");
                    style.applyTemplates(node);
                }
            });
        Rule bookRule = new Rule(DocumentHelper.createPattern("book"),
            new BookAction());

        style.addRule(booksRule);
        style.addRule(bookRule);

        try {
            style.run(input);
        } catch (Exception e) {
            System.err.println("Unable to transform: " + e.getMessage());
            e.printStackTrace();
        }
        return result;
    }
}
For purposes of the example, I've created implementations of the Action interface both as an anonymous inner class and as a named inner class. Both are reasonable options, as is creating a regular public class that implements the Action interface. Which approach fits a particular case depends largely on which objects the run( ) method needs to access.
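The third option, a regular class that implements Action, might look like the following. This is a minimal sketch assuming dom4j is on the classpath; the class name PubDateToAttributeAction is hypothetical, and it uses Element's addAttribute( ) method as a shortcut for creating and adding the attribute. Because a standalone class cannot see the fields or local variables of the code that creates it, the element that receives the output is passed in through the constructor.

```java
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.rule.Action;

// Hypothetical standalone Action: everything it needs arrives via the constructor
class PubDateToAttributeAction implements Action {

    // element in the result document that receives the transformed copies
    private final Element target;

    PubDateToAttributeAction(Element target) {
        this.target = target;
    }

    public void run(Node node) throws Exception {
        if (!(node instanceof Element)) {
            return;
        }
        // work on a detached copy so the input document is left untouched
        Element copy = ((Element) node).createCopy();
        Element pubDate = copy.element("pubDate");
        if (pubDate != null) {
            // replace the pubDate child element with a pubDate attribute
            copy.remove(pubDate);
            copy.addAttribute("pubDate", pubDate.getTextTrim());
        }
        target.add(copy);
    }
}
```

This design assumes the target element already exists when the rule is constructed, so the result document's root would have to be created before the rules are registered rather than inside another rule's action.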