PropsToXML

To put some real code to the task of learning JDOM, let me introduce the PropsToXML class. This class is a utility that takes a standard Java properties file and converts it to an XML equivalent. Many developers out there have requested a means of doing exactly this; it often allows legacy apps using properties files to easily convert to using XML without the overhead of manually converting the configuration files.

Java Properties Files

If you have never worked with Java properties files, they are essentially files of name-value pairs that can be read easily with some Java classes (for instance, the java.util.Properties class). These files often look similar to Example 9-1, and in fact, I will use this example properties file throughout the rest of the chapter. Incidentally, it's from the Enhydra app server.

Example 9-1. A typical Java properties file

#
# Properties added to System properties
#
# sax parser implementing class
org.xml.sax.parser="org.apache.xerces.parsers.SAXParser"
#
# Properties used to start the server
#
# Class used to start the server
org.enhydra.initialclass=org.enhydra.multiServer.bootstrap.Bootstrap
# initial arguments passed to the server (replace command line args)
org.enhydra.initialargs="./bootstrap.conf"
# Classpath for the parent top enhydra classloader
org.enhydra.classpath="."
# separator for the classpath above
org.enhydra.classpath.separator=":"

No big deal here, right? Well, using an instance of the Java Properties class, you can load these properties into the object (using the load(InputStream inputStream) method) and then deal with them like a Hashtable. In fact, the Properties class extends the Hashtable class in Java; nice, huh? The problem is that many people write these files like the example, with names separated by a period (.) to form a sort of hierarchical structure. In the example, you would have a top level (the properties file itself), then the org node, and under it the xml and enhydra nodes, and under the enhydra node several nodes, some with values. In other words, you'd expect a structure like the one shown in the figure that follows.

[Figure: Expected structure of the properties shown in Example 9-1]

While this sounds good, Java provides no means of accessing the name-value pairs in this manner. It does not give the period any special value, but instead treats it as just another character. So while you can do this:

// props is a loaded java.util.Properties instance
String classpathValue = props.getProperty("org.enhydra.classpath");

You cannot do this:

// No such method exists on java.util.Properties
List enhydraProperties = props.getProperties("org.enhydra");

You would expect (or at least I do!) that the latter would work, and provide you all the subproperties with the structure org.enhydra (org.enhydra.classpath, org.enhydra.initialargs, etc.). Unfortunately, that's not part of the Properties class. For this reason, many developers have had to write their own little wrapper methods around this object, which of course is nonstandard and a bit of a nuisance. Wouldn't it be nice if this information could be modeled in XML, where operations like the second example are simple? That's exactly what I want to write code to do, and I will use JDOM to demonstrate that API.
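The wrapper methods mentioned above usually amount to a prefix filter over the property names. Here is a minimal sketch of one, assuming a helper class and method name (SubProperties.withPrefix) that are purely illustrative and not part of java.util.Properties; it also assumes Java 6 or later for stringPropertyNames( ):

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper; nothing like this exists in java.util.Properties.
class SubProperties {

    /**
     * Returns every property whose name starts with prefix + ".",
     * keyed by the full property name and sorted for stable output.
     */
    static Map<String, String> withPrefix(Properties props, String prefix) {
        Map<String, String> result = new TreeMap<String, String>();
        String start = prefix + ".";
        for (String name : props.stringPropertyNames()) {
            if (name.startsWith(start)) {
                result.put(name, props.getProperty(name));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("org.enhydra.classpath", "\".\"");
        props.setProperty("org.enhydra.initialargs", "\"./bootstrap.conf\"");
        props.setProperty("org.xml.sax.parser",
                          "\"org.apache.xerces.parsers.SAXParser\"");

        // Prints only the org.enhydra.* entries, one per line
        for (Map.Entry<String, String> entry
                : withPrefix(props, "org.enhydra").entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

Every application that needs grouped properties ends up carrying some variation of this filter, which is the nuisance a hierarchical XML representation avoids.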

Converting to XML

As in previous chapters, it is easiest to start with a skeleton for the class and build out. For the PropsToXML class, I want to allow a properties file to be supplied for input, and the name of a file for the XML output. The class reads in the properties file, converts it to an XML document using JDOM, and outputs it to the specified filename. Example 9-2 starts the ball rolling.

Example 9-2. The skeleton of the PropsToXML class

package javaxml3;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Enumeration;
import java.util.Properties;

import org.jdom.Document;
import org.jdom.Element;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;

public class PropsToXML {

    /**
     * <p> This will take the supplied properties file, and
     *   convert that file to an XML representation, which is
     *   then output to the supplied XML document filename. </p>
     *
     * @param propertiesFilename file to read in as Java properties.
     * @param xmlFilename file to output XML representation to.
     * @throws <code>IOException</code> - when errors occur.
     */
    public void convert(String propertiesFilename, String xmlFilename)
        throws IOException {

        // Get Java Properties object
        FileInputStream input = new FileInputStream(propertiesFilename);
        Properties props = new Properties( );
        props.load(input);

        // Convert to XML
        convertToXML(props, xmlFilename);
    }

    /**
     * <p> This will handle the detail of conversion from a Java
     *   <code>Properties</code> object to an XML document. </p>
     *
     * @param props <code>Properties</code> object to use as input.
     * @param xmlFilename file to output XML to.
     * @throws <code>IOException</code> - when errors occur.
     */
    private void convertToXML(Properties props, String xmlFilename)
        throws IOException {

        // JDOM conversion code goes here
    }

    /**
     * <p> Provide a static entry point for running. </p>
     */
    public static void main(String[] args) {
        if (args.length != 2) {
            System.out.println("Usage: java javaxml3.PropsToXML " +
                "[properties file] [XML file for output]");
            System.exit(0);
        }

        try {
            PropsToXML propsToXML = new PropsToXML( );
            propsToXML.convert(args[0], args[1]);
        } catch (Exception e) {
            e.printStackTrace( );
        }
    }
}

The only new part of this code is the Java Properties object, which I've mentioned previously. The supplied properties filename is used to open a FileInputStream, which is passed to the load( ) method, and the resulting Properties object is handed on to a method that will use JDOM, which I will turn to next.
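To see what load( ) actually hands back, here is a small sketch that loads a few lines of the example file from an in-memory string rather than from disk. The class name LoadDemo and the use of StringReader are choices made for illustration only, and load(Reader) requires Java 6 or later. It shows that the quotation marks in the file are kept as part of each value, and that a name prefix such as org.enhydra is not a key in its own right:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Illustrative only; not part of the PropsToXML class.
class LoadDemo {

    static Properties loadFrom(String text) {
        Properties props = new Properties();
        StringReader reader = new StringReader(text);
        try {
            props.load(reader);
        } catch (IOException e) {
            // Not expected when reading from a String
            throw new IllegalStateException(e);
        } finally {
            reader.close();
        }
        return props;
    }

    public static void main(String[] args) {
        String text =
            "# Classpath for the parent top enhydra classloader\n" +
            "org.enhydra.classpath=\".\"\n" +
            "org.enhydra.initialclass=org.enhydra.multiServer.bootstrap.Bootstrap\n";
        Properties props = loadFrom(text);

        // The comment line is skipped, so only the two properties are counted
        System.out.println(props.size());
        // The surrounding quotes are part of the stored value
        System.out.println(props.getProperty("org.enhydra.classpath"));
        // No hierarchy: the prefix alone is not a property name
        System.out.println(props.getProperty("org.enhydra"));
    }
}
```

The retained quotes are worth remembering, because they reappear verbatim in the XML output generated later in the chapter.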

Creating XML with JDOM

Once the code has the properties in a (more) usable form, it's time to start using JDOM. The first task is to create a JDOM Document object and an Element object for the root element. The root element for a Document can be set either in the constructor or through the setRootElement( ) method. Creating an Element requires only the element's name. There are alternate constructors that take namespace information, and I will discuss those a little later. For now, it's easiest to use just the root element's name, and since this needs to be a top-level, arbitrary name (to contain all the property nestings), I use "properties" in the code. Once this element is created, it's used to create a new JDOM Document.

Then, it's on to dealing with the properties in the supplied file. The list of property names is obtained as a Java Enumeration through the Properties object's propertyNames( ) method. Once a name is available, it can be used to obtain the property value through the getProperty( ) method. At this point, you've got the root element of the new XML document, the property name to add, and the value for that property. Then, like any other good program, you iterate through the remaining properties until finished. At each step, this information is supplied to a new method, createXMLRepresentation( ), which performs the logic for converting a single property into a set of XML elements. Add this code, as shown here, to your source file:

    private void convertToXML(Properties props, String xmlFilename)
        throws IOException {

        // Create a new JDOM Document with a root element "properties"
        Element root = new Element("properties");
        Document doc = new Document(root);

        // Get the property names
        Enumeration propertyNames = props.propertyNames( );
        while (propertyNames.hasMoreElements( )) {
            String propertyName = (String)propertyNames.nextElement( );
            String propertyValue = props.getProperty(propertyName);
            createXMLRepresentation(root, propertyName, propertyValue);
        }

        // Output document to supplied filename
        Format format = Format.getPrettyFormat( );
        XMLOutputter outputter = new XMLOutputter(format);
        FileOutputStream output = new FileOutputStream(xmlFilename);
        outputter.output(doc, output);
    }

Don't worry about the last few lines that output the JDOM Document yet. I'll deal with this in the next section, but first I want to cover the createXMLRepresentation( ) method, which contains the logic for dealing with a single property and creating XML from it.

The logical first step in creating the XML representation of a property is also the easiest: take the name of the property and create an Element with that name. You've already seen how to do this; simply pass the name of the element to its constructor. Once the element is created, assign the value of the property as the textual content of the element. This can be done easily enough through the setText( ) method, which of course takes a String. Once the element is ready for use, it can be added as a child of the root element through the addContent( ) method. In fact, addContent( ) accepts any object that extends org.jdom.Content, which is an abstract class rather than an interface. Besides Element, its subclasses include EntityRef, Comment, ProcessingInstruction, and Text (DocType is also a Content subclass, but it belongs on the Document rather than inside an element). But I'll get to those later; for now, add the following method into your source file:

    /**
     * <p> This will convert a single property and its value to
     *   an XML element and textual value. </p>
     *
     * @param root JDOM root <code>Element</code> to add children to.
     * @param propertyName name to base element creation on.
     * @param propertyValue value to use for property.
     */
    private void createXMLRepresentation(Element root, String propertyName,
                                         String propertyValue) {
        Element element = new Element(propertyName);
        element.setText(propertyValue);
        root.addContent(element);
    }

At this point, you can compile the source file, and then use the resulting PropsToXML class. Supply a properties file (you can type in or download the enhydra.properties file shown earlier in this chapter), as well as an output filename, as shown here:

/javaxml3/build $ java javaxml3.PropsToXML \
 /javaxml3/ch09/properties/enhydra.properties \
 enhydraProps.xml

This whirs along for a fraction of a second, and then generates an enhydraProps.xml file. Open this up; it should look like Example 9-3.

If you are unfamiliar with *nix, the backslash at the end of each line (\) simply allows for continuation of the command on the next line; Windows users should enter the entire command on one line.

Example 9-3. First version of the enhydraProps.xml document

<?xml version="1.0" encoding="UTF-8"?>
<properties>
 <org.enhydra.classpath.separator>":"</org.enhydra.classpath.separator>
 <org.enhydra.initialargs>"./bootstrap.conf"</org.enhydra.initialargs>
 <org.enhydra.initialclass>org.enhydra.multiServer.bootstrap.Bootstrap
</org.enhydra.initialclass>
 <org.enhydra.classpath>"."</org.enhydra.classpath>
 <org.xml.sax.parser>"org.apache.xerces.parsers.SAXParser"
</org.xml.sax.parser>
</properties>

Note that the line wraps in the example are for publishing purposes only; in your document, each property with opening tag, text, and closing tag should be on its own line.

In about 50 lines of code, you've gone from Java properties to XML. However, this XML document isn't much better than the properties file: there is still no way to relate the org.enhydra.initialargs property to the org.enhydra.classpath property. Our job isn't done yet. Instead of using the property name as the element name, the code needs to take the property name and split it on the period delimiters. For each of these "subnames," an element needs to be created and added to the element stack. Then the process can repeat. For the property name org.xml.sax, the following XML structure should result:

<org>
 <xml>
 <sax>[property Value]</sax>
 </xml>
</org>

At each step, using the Element constructor and the addContent( ) method does the trick, and once the name is completely deconstructed, the setText( ) method can be used to set the last element's textual value. The best way is to create a new Element variable called current and use it as a "pointer" (there aren't any pointers in Java; it's just a term). It will always point at the element that content should be added to.

At each step, the code also needs to see if the element to be added already exists. For example, the first property, org.xml.sax, creates an org element. When the next property is added (org.enhydra.classpath), the org element does not need to be created again. To facilitate this, the getChild( ) method is used. This method takes the name of the child element to retrieve, and is available to all instances of the Element class. If the specified child exists, that element is returned. However, if no such child exists, a null value is returned, and it is on this null value that our code can key. In other words, if the return value is an element, that becomes the current element, and no new element needs to be created (it already exists). However, if the return from the getChild( ) call is null, a new element must be created with the current subname and added as content to the current element, and then the current pointer is moved down the tree.

Finally, once the iteration is over, the textual value of the property can be set on a new leaf element, which is added to (nicely) the element that the current pointer references. Add this code to your source file:

    private void createXMLRepresentation(Element root, String propertyName,
                                         String propertyValue) {
        /*
        Element element = new Element(propertyName);
        element.setText(propertyValue);
        root.addContent(element);
        */

        int split;
        String name = propertyName;
        Element current = root;
        Element test = null;

        while ((split = name.indexOf(".")) != -1) {
            String subName = name.substring(0, split);
            name = name.substring(split+1);

            // Check for existing element
            if ((test = current.getChild(subName)) == null) {
                Element subElement = new Element(subName);
                current.addContent(subElement);
                current = subElement;
            } else {
                current = test;
            }
        }

        // When out of loop, what's left is the final element's name
        Element last = new Element(name);
        last.setText(propertyValue);
        current.addContent(last);
    }
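One consequence of this logic is easy to miss: getChild( ) is only consulted for the intermediate subnames, while the final subname always gets a brand-new element. The sketch below replays the same walk on a tiny stand-in class (PropNode is hypothetical and only mimics the Element methods used above; it is not JDOM) so that it runs without any libraries. Adding org.enhydra.classpath.separator before org.enhydra.classpath leaves two classpath children under enhydra:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for org.jdom.Element, for illustration only.
class PropNode {
    final String name;
    String text = "";
    final List<PropNode> children = new ArrayList<PropNode>();

    PropNode(String name) {
        this.name = name;
    }

    // Like Element.getChild( ): the first child with this name, or null
    PropNode getChild(String childName) {
        for (PropNode child : children) {
            if (child.name.equals(childName)) {
                return child;
            }
        }
        return null;
    }

    void addContent(PropNode child) {
        children.add(child);
    }

    // Same algorithm as createXMLRepresentation( ) above
    static void add(PropNode root, String propertyName, String propertyValue) {
        String name = propertyName;
        PropNode current = root;
        int split;
        while ((split = name.indexOf('.')) != -1) {
            String subName = name.substring(0, split);
            name = name.substring(split + 1);
            PropNode existing = current.getChild(subName);
            if (existing == null) {
                existing = new PropNode(subName);
                current.addContent(existing);
            }
            current = existing;
        }
        // The leaf is never looked up; it is always created
        PropNode last = new PropNode(name);
        last.text = propertyValue;
        current.addContent(last);
    }

    public static void main(String[] args) {
        PropNode root = new PropNode("properties");
        add(root, "org.enhydra.classpath.separator", "\":\"");
        add(root, "org.enhydra.classpath", "\".\"");

        // Lists each child of <enhydra> with its text and child count
        PropNode enhydra = root.getChild("org").getChild("enhydra");
        for (PropNode child : enhydra.children) {
            System.out.println(child.name + " text=" + child.text
                    + " children=" + child.children.size());
        }
    }
}
```

This is why the enhydra element ends up with two classpath children in the regenerated output. If a single classpath element is preferred, one possible change (an assumption, not something the original code does) is to look the leaf up with getChild( ) as well and call setText( ) on it when it already exists.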

With this addition in place, recompile the program and run it again. This time, your output should be a lot nicer, as shown in Example 9-4.

Example 9-4. Updated output from PropsToXML

<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <org>
    <enhydra>
      <classpath>
        <separator>":"</separator>
      </classpath>
      <initialargs>"./bootstrap.conf"</initialargs>
      <initialclass>org.enhydra.multiServer.bootstrap.Bootstrap</initialclass>
      <classpath>"."</classpath>
    </enhydra>
    <xml>
      <sax>
        <parser>"org.apache.xerces.parsers.SAXParser"</parser>
      </sax>
    </xml>
  </org>
</properties>

And, just as quickly as you've started in on JDOM, you have the hang of it. However, you might notice that the XML document violates one of the rules of thumb for document design introduced earlier in the book (in the "DTD Semantics" section, which details usage of elements versus usage of attributes). You see, each property has a single textual value. That arguably makes the property values suitable as attributes of the last element on the stack, rather than content. Proving that rules are meant to be broken, I prefer them as content in this case.

For demonstration purposes, let's look at converting the property values to attributes rather than textual content. This turns out to be quite easy, and can be done in one of two ways. The first is to create an instance of the JDOM Attribute class. The constructor for that class takes the name of the attribute and its value. Then, the resulting instance can be added to the leaf element with that element's setAttribute( ) method. That approach is shown here:

        // When out of loop, what's left is the final element's name
        Element last = new Element(name);
        /* last.setText(propertyValue); */
        Attribute attribute = new Attribute("value", propertyValue);
        last.setAttribute(attribute);
        current.addContent(last);

If you want to compile the file with these changes, be sure you add an import statement for the Attribute class:

import org.jdom.Attribute;

A slightly easier way is to use one of the convenience methods that JDOM offers. Since adding attributes is such a common task, the Element class provides an overloaded version of setAttribute( ) that takes a name and value, and internally creates an Attribute object. In this case, that approach is a little clearer:

        // When out of loop, what's left is the final element's name
        Element last = new Element(name);
        /* last.setText(propertyValue); */
        last.setAttribute("value", propertyValue);
        current.addContent(last);

This works just as well and also avoids having to use an extra import statement. You can make this change, compile the source file, and run the sample program. The new output should match Example 9-5.

Example 9-5. Output of PropsToXML using attributes

<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <org>
    <enhydra>
      <classpath>
        <separator value="&quot;:&quot;" />
      </classpath>
      <initialargs value="&quot;./bootstrap.conf&quot;" />
      <initialclass value="org.enhydra.multiServer.bootstrap.Bootstrap" />
      <classpath value="&quot;.&quot;" />
    </enhydra>
    <xml>
      <sax>
        <parser value="&quot;org.apache.xerces.parsers.SAXParser&quot;" />
      </sax>
    </xml>
  </org>
</properties>

Each property value is now an attribute of the innermost element. Notice that JDOM converts the quotation marks within the attribute values to &quot; entity references; a literal double quote cannot appear inside a double-quoted attribute value, so the escaping keeps the output well-formed. However, this makes the output a little less clean, so you may want to switch your code back to using textual data within elements, rather than attributes.
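If the &quot; references are a nuisance but attributes are still wanted, one option is to strip the surrounding quotation marks from each value before handing it to setAttribute( ). The helper below is a sketch of that idea; the class and method names (QuoteStripper.strip) are made up for this example and are not part of JDOM or of PropsToXML as written:

```java
// Hypothetical helper, shown only to illustrate the idea.
class QuoteStripper {

    /**
     * Removes one pair of enclosing double quotes, if present.
     * Values without both an opening and a closing quote are returned as-is.
     */
    static String strip(String value) {
        if (value != null && value.length() >= 2
                && value.startsWith("\"") && value.endsWith("\"")) {
            return value.substring(1, value.length() - 1);
        }
        return value;
    }

    public static void main(String[] args) {
        // A quoted value loses its enclosing quotes
        System.out.println(strip("\"./bootstrap.conf\""));
        // An unquoted value is left untouched
        System.out.println(strip("org.enhydra.multiServer.bootstrap.Bootstrap"));
    }
}
```

In createXMLRepresentation( ), that would mean calling last.setAttribute("value", QuoteStripper.strip(propertyValue)). Bear in mind that this changes the data: an application that expects the quotes in its property values will no longer see them.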

Outputting XML with JDOM

Before we continue, I want to spend a little time on the output portion of the code that I mentioned earlier in the chapter. The method is repeated here; the output portion is the last few lines:

    private void convertToXML(Properties props, String xmlFilename)
        throws IOException {

        // Create a new JDOM Document with a root element "properties"
        Element root = new Element("properties");
        Document doc = new Document(root);

        // Get the property names
        Enumeration propertyNames = props.propertyNames( );
        while (propertyNames.hasMoreElements( )) {
            String propertyName = (String)propertyNames.nextElement( );
            String propertyValue = props.getProperty(propertyName);
            createXMLRepresentation(root, propertyName, propertyValue);
        }

        // Output document to supplied filename
        Format format = Format.getPrettyFormat( );
        XMLOutputter outputter = new XMLOutputter(format);
        FileOutputStream output = new FileOutputStream(xmlFilename);
        outputter.output(doc, output);
    }

You already know that XMLOutputter is the class to use for handling output to a file, stream, or other static representation. In addition, I supplied a Format object, which instructs the XMLOutputter to use newlines after each element, to indent each level of element nesting with two spaces, and to trim any whitespace before and after character data.

The Format class

JDOM provides three default formatting schemes, available through static methods on the Format class. We've already seen Format.getPrettyFormat( ). Format.getRawFormat( ) returns the default formatting scheme, where no extra formatting is done. Format.getCompactFormat( ) doesn't add indentation or additional newlines, but it does normalize the character data in the document, removing leading and trailing whitespace and replacing each run of contiguous whitespace characters with a single space. The Format class allows for customization of any of these Format instances through the following properties:

escapeStrategy: An instance of org.jdom.output.EscapeStrategy that allows you to specify which characters should be escaped for output.
lineSeparator: Which character or characters should be used when the Format instance adds newlines to the output. This setting applies only to newlines created while outputting.
omitEncoding: Setting this property to true tells the outputter to skip the outputting of the name of the character encoding scheme in the XML declaration.
omitDeclaration: Setting this property to true tells the outputter to skip the outputting of the XML declaration entirely.
expandEmptyElements: This property asks the outputter to never output an empty element such as <name/> and always output <name></name>.
ignoreTrAXEscapingPIs: TrAX defines processing instructions that tell a TrAX Transformer not to escape the output document with the standard XML character entities, which is useful when the result of a transformation will not be parsed as XML. JDOM can recognize these processing instructions and apply that behavior in the outputter.
textMode: Specify a strategy for whitespace handling. The options for textMode are Format.TextMode.PRESERVE, Format.TextMode.TRIM, Format.TextMode.NORMALIZE, and Format.TextMode.TRIM_FULL_WHITE.
indent: Specify one or more characters to use as the indentation string for each level of element nesting. If indent is non-null, a newline is automatically added before the indentation.
encoding: Specify a character encoding for the output.

By setting these properties, you should be able to get the output of XMLOutputter to be just as you want it.

Why Don't My Format Customizations Stick?

One frequent area of confusion is why this code doesn't actually change the encoding of the output:

outputter.getFormat( ).setEncoding("ISO-8859-1");

The reason is that the getFormat( ) method actually returns a copy of the Format object. As a result, you need to pass the modified Format object back to the setFormat( ) method:

Format format = outputter.getFormat( );
format.setEncoding("ISO-8859-1");
outputter.setFormat(format);

Since setEncoding( ) returns the Format object, you could even write this as:

outputter.setFormat(outputter.getFormat( )
 .setEncoding("ISO-8859-1"));
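The copy-on-get behavior described here is easier to see in isolation. The following sketch uses two stand-in classes, Settings and Holder, that are hypothetical and only imitate the getFormat( )/setFormat( ) contract described above; they are not JDOM classes:

```java
// Hypothetical stand-ins that imitate the contract described above.
class FormatCopyDemo {

    // Plays the role of Format
    static class Settings implements Cloneable {
        private String encoding = "UTF-8";

        String getEncoding() {
            return encoding;
        }

        // Returns this, so calls can be chained
        Settings setEncoding(String newEncoding) {
            encoding = newEncoding;
            return this;
        }

        @Override
        public Settings clone() {
            try {
                return (Settings) super.clone();
            } catch (CloneNotSupportedException e) {
                // Cannot happen: this class is Cloneable
                throw new AssertionError(e);
            }
        }
    }

    // Plays the role of XMLOutputter
    static class Holder {
        private Settings settings = new Settings();

        // Hands out a copy, so callers cannot change internal state directly
        Settings getSettings() {
            return settings.clone();
        }

        void setSettings(Settings newSettings) {
            settings = newSettings.clone();
        }
    }

    public static void main(String[] args) {
        Holder holder = new Holder();

        // Modifies a throwaway copy; the holder keeps its original encoding
        holder.getSettings().setEncoding("ISO-8859-1");
        System.out.println(holder.getSettings().getEncoding());

        // Passing the modified copy back makes the change stick
        holder.setSettings(holder.getSettings().setEncoding("ISO-8859-1"));
        System.out.println(holder.getSettings().getEncoding());
    }
}
```

Handing out copies protects the holder from accidental changes made elsewhere in a program, at the cost of the extra set call shown in the second half of main( ).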

Other uses of XMLOutputter

There are versions of the output( ) method (the one used in the example code) that take either an OutputStream or a Writer, as well as an outputString( ) method that returns a String object. And, although XMLOutputter is generally used to output a Document object, there are versions that take the various JDOM constructs as input, so you could output an entire Document, or just an Element, Comment, ProcessingInstruction, or anything else:

// Create an outputter with two-space indentation and new lines
XMLOutputter outputter = new XMLOutputter(Format.getPrettyFormat( ));

// Output different JDOM constructs
outputter.output(myDocument, myOutputStream);
outputter.output(myElement, myWriter);
outputter.output(myComment, myOutputStream);
// etc...

In other words, XMLOutputter serves all of your XML output needs. Of course, you can also use DOMOutputter and SAXOutputter.