Document Output with StAX

The StAX specification states that the first design goal for StAX is to provide "symmetrical APIs for reading and writing XML using a streaming paradigm." This is a significant difference from SAX, which provides an API for reading only. Writing XML documents with StAX solves the fundamental problem with using DOM or any DOM-like API: you do not have to create the entire document in memory before being able to serialize it. Instead, you write events, using the same event vocabulary we've already discussed in this chapter, to a writer object that is attached to an output stream. The writer object will flush the character representation of those events to the output stream as necessary, or when your code requests it by calling the flush( ) method. As a result, it is possible to create massive documents with StAX using a limited amount of memory, something that isn't possible with DOM.

As with the reading APIs, there are two main interfaces for document output: XMLStreamWriter and XMLEventWriter. Both are created through the abstract class XMLOutputFactory, an instance of which is obtained by calling its static newInstance( ) method. The concrete implementation of XMLOutputFactory returned by newInstance( ) is determined using the same lookup process described earlier in this chapter for XMLInputFactory. Once you have obtained an instance of XMLOutputFactory, instances of XMLStreamWriter and XMLEventWriter are obtained by invoking methods named createXMLStreamWriter( ) and createXMLEventWriter( ), respectively. As with the createXMLStreamReader( ) and createXMLEventReader( ) methods of XMLInputFactory, these writer creation methods have several overloaded versions. There are overloaded methods for each that accept:

  • A java.io.Writer
  • A java.io.OutputStream
  • A java.io.OutputStream and a character set encoding
  • A javax.xml.transform.Result

As with XMLInputFactory and javax.xml.transform.Source, support by XMLOutputFactory for javax.xml.transform.Result is optional, and if an implementation does not provide support for Result outputs, both createXMLEventWriter( ) and createXMLStreamWriter( ) will throw a java.lang.UnsupportedOperationException. To keep things simple, in the examples below, we'll be using the System.out OutputStream.
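
The overloads listed above can be sketched as follows. This is a minimal illustration rather than a definitive recipe: the class name WriterTargets and its helper methods are invented for this example, and the same small document is written to a java.io.Writer, to a java.io.OutputStream with an explicit encoding, and (if the implementation supports it) to a javax.xml.transform.Result.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;
import javax.xml.transform.stream.StreamResult;

// Hypothetical class name, used only for this sketch.
class WriterTargets {

    // Writes the same tiny document to whichever writer it is given.
    static void writeGreeting(XMLStreamWriter writer) throws XMLStreamException {
        writer.writeStartElement("greeting");
        writer.writeCharacters("hello");
        writer.writeEndElement();
        writer.flush();
        writer.close(); // releases the writer; the underlying stream stays open
    }

    // Overload that accepts a java.io.Writer.
    static String viaWriter() throws XMLStreamException {
        StringWriter out = new StringWriter();
        writeGreeting(XMLOutputFactory.newInstance().createXMLStreamWriter(out));
        return out.toString();
    }

    // Overload that accepts a java.io.OutputStream and an encoding name.
    static String viaOutputStream() throws XMLStreamException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeGreeting(XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8"));
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Result support is optional, so be prepared for UnsupportedOperationException.
    static boolean resultSupported() throws XMLStreamException {
        try {
            writeGreeting(XMLOutputFactory.newInstance()
                    .createXMLStreamWriter(new StreamResult(new StringWriter())));
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(viaWriter());
        System.out.println(viaOutputStream());
        System.out.println("Result supported: " + resultSupported());
    }
}
```

Whether the last line reports true or false depends on the StAX implementation found on your classpath.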

XMLStreamWriter

As with XMLStreamReader, the XMLStreamWriter interface defines quite a few methods. But also like XMLStreamReader, it's an easy-to-use interface that allows for lightweight operations without a lot of extra object creation. The full XMLStreamWriter interface is diagrammed in the following figure.

Figure: The XMLStreamWriter interface

Writing an XML document with XMLStreamWriter is simply a matter of creating the writer, calling a series of write methods, and then flushing the writer. The following simple example writes a small document this way.

Example Simple XMLStreamWriter example

package javaxml3;

import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

public class SimpleStreamOutput {

    public static void main(String[] args) throws Exception {
        XMLOutputFactory outputFactory = XMLOutputFactory.newInstance( );
        XMLStreamWriter writer = outputFactory.createXMLStreamWriter(System.out);
        writer.writeStartDocument("1.0");
        writer.writeStartElement("person");
        writer.writeStartElement("name");
        writer.writeStartElement("first_name");
        writer.writeCharacters("Alan");
        writer.writeEndElement( );   // closes first_name
        writer.writeEndElement( );   // closes name
        writer.writeEndElement( );   // closes person
        writer.writeEndDocument( );
        writer.flush( );
    }
}

When this example is run, it produces this on the console (all on one line):

<?xml version="1.0"?><person><name><first_name>Alan</first_name></name></person>

StAX will not prevent you from creating XML that is not well-formed. If the three writeEndElement( ) calls and the writeEndDocument( ) call were omitted from the example above, the output would be:

<?xml version="1.0"?><person><name><first_name>Alan

The one feature XMLStreamWriter has to help ensure that a document is well-formed is that writeEndDocument( ) will close any open elements. As a result, the output of the example would be the same if the three writeEndElement( ) calls were removed. XMLStreamWriter will also replace the appropriate characters with the entities &amp;, &lt;, and &gt; inside CHARACTERS events and those entities plus &quot; (and, depending on the implementation, &apos;) inside ATTRIBUTE events.
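
Both behaviors can be seen in a short sketch. The class name EscapingAndAutoClose and the element names are made up for illustration, and the output goes to a StringWriter rather than System.out so that the result can be inspected. No writeEndElement( ) calls are made, and the text and attribute value contain characters that need escaping.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical class name, used only for this sketch.
class EscapingAndAutoClose {

    static String build() throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("person");
        writer.writeStartElement("note");
        // The ampersand and double quotes are escaped in the attribute value.
        writer.writeAttribute("title", "R&D \"draft\"");
        // The <, & and > characters are escaped in character data.
        writer.writeCharacters("1 < 2 & 3 > 2");
        // No writeEndElement() calls: writeEndDocument() closes <note> and <person>.
        writer.writeEndDocument();
        writer.flush();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(build());
    }
}
```

Relying on writeEndDocument( ) to close elements is convenient at the end of a document, but explicit writeEndElement( ) calls are still needed whenever sibling elements follow.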

What? No Pretty Printing?

If you've used any DOM-style APIs for creating XML documents, you may be surprised that there's no mechanism for pretty printing: putting elements on their own lines, using tabs, and so on. If you absolutely need pretty printing, I suggest looking at the JTidy library. JTidy is a highly configurable standalone XML pretty printer: you pass it a java.io.InputStream, a DOM Document object, or a DOM Node object along with a java.io.OutputStream, and JTidy will output a pretty-printed version of your document to the OutputStream.

To use JTidy with XMLStreamWriter or XMLEventWriter, you could use a ByteArrayOutputStream with the StAX writer and pass the resulting byte array to JTidy with a ByteArrayInputStream, or have StAX write the document to a FileOutputStream and pass JTidy a FileInputStream. Before doing this, be sure you really need pretty printing, as there is a performance hit to using JTidy. Perhaps some future implementations of StAX will provide pretty printing through a vendor-specific property, or it will be added to a future version of the specification.

Attributes are added with the writeAttribute( ) methods and are attached to the currently open element. If there are attributes to be written, they must be written before character data, comments, processing instructions, entity references, and other elements; otherwise, a javax.xml.stream.XMLStreamException will be thrown. The following example shows the proper sequence of method calls.

Example Simple XMLStreamWriter example with attribute

package javaxml3;

import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

public class AttributeStreamOutput {

    public static void main(String[] args) throws Exception {
        XMLOutputFactory outputFactory = XMLOutputFactory.newInstance( );
        XMLStreamWriter writer = outputFactory.createXMLStreamWriter(System.out);
        writer.writeStartDocument("1.0");
        writer.writeStartElement("addresses");
        writer.writeStartElement("address");
        writer.writeAttribute("type", "work");
        writer.writeStartElement("street");
        writer.writeCharacters("1515 Broadway");
        writer.writeComment("in the heart of Times Square");
        writer.writeEndElement( );
        writer.writeEndElement( );
        writer.writeEndElement( );
        writer.writeEndDocument( );
        writer.flush( );
    }
}
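
For contrast, here is a small sketch of the wrong order, in which an attribute is written after character data. The class and method names are invented for this illustration. The interface documentation also mentions IllegalStateException for this situation, so the sketch catches both exception types rather than assuming which one a given implementation throws.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical class name, used only for this sketch.
class AttributeOrder {

    // Returns true if writing an attribute after character data is rejected.
    static boolean lateAttributeRejected() throws XMLStreamException {
        XMLStreamWriter writer = XMLOutputFactory.newInstance()
                .createXMLStreamWriter(new StringWriter());
        writer.writeStartElement("street");
        writer.writeCharacters("1515 Broadway");
        try {
            // Too late: the start tag was closed when the text was written.
            writer.writeAttribute("type", "work");
            return false;
        } catch (XMLStreamException | IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println("Late attribute rejected: " + lateAttributeRejected());
    }
}
```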

XMLStreamWriter also has writeEmptyElement( ) methods, which create empty XML elements. Empty elements follow the same rules as regular elements with respect to attributes: once character data, a comment, a processing instruction, an entity reference, or another element is written, the empty element is closed and no more attributes can be added to it. Because we can force the XMLStreamWriter to flush its buffer to the output stream with the flush( ) method, we can write code such as:

package javaxml3;

import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

public class EmptyElement {

    public static void main(String[] args) throws Exception {
        XMLOutputFactory outputFactory = XMLOutputFactory.newInstance( );
        XMLStreamWriter writer = outputFactory.createXMLStreamWriter(System.out);
        writer.writeEmptyElement("empty");
        writer.flush( ); // write '<empty' to the console
        System.out.println("\n");
        writer.writeAttribute("attribute", "true");
        writer.flush( ); // write ' attribute="true"' to the console
        System.out.println("\n");
        writer.writeEndDocument( );
        writer.flush( ); // write '/>' to the console
    }
}

This shows that the empty element isn't actually closed until the call to writeEndDocument( ). Note that in this example, if we hadn't called writeEndDocument( ), XMLStreamWriter would never have closed the empty element. Always remember that StAX does not try to make your document well-formed.
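
The more common case, in which an empty element is closed because something else is written after it, might look like the following sketch. The class name and the list/item vocabulary are invented for this illustration. Each writeEmptyElement( ) call closes the previous empty element, and the final writeEndElement( ) closes the last empty element before ending the parent.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical class name, used only for this sketch.
class EmptyElementList {

    static String build() throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("list");
        writer.writeEmptyElement("item");
        writer.writeAttribute("id", "1"); // allowed: the first item is still open
        writer.writeEmptyElement("item"); // closes the first item, opens the second
        writer.writeAttribute("id", "2");
        writer.writeEndElement();         // closes the second item, then ends list
        writer.flush();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(build());
    }
}
```

Note that no writeEndElement( ) call is paired with writeEmptyElement( ); the single writeEndElement( ) here belongs to the list element.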

Namespace support

Namespace support is provided by XMLStreamWriter in two separate ways. First, namespace URIs and, optionally, prefixes can be attached to elements and attributes using overloaded versions of the writeStartElement( ), writeAttribute( ), and writeEmptyElement( ) methods. Second, the attributes that associate namespace URIs with their prefixes are written to open elements. XMLStreamWriter does not ensure that namespace attributes are written, but it does require that all namespace URIs attached to elements and attributes are associated with a prefix. XMLStreamWriter also supports setting the default namespace for an element. A call to writeStartElement such as:

writer.writeStartElement("ns1", "sample", "http://www.example.com/ns1");

will result in the following element being written to the output stream:

<ns1:sample>

To write the xmlns attribute, you also need to write a NAMESPACE event using the writeNamespace( ) method:

writer.writeNamespace("ns1", "http://www.example.com/ns1");

These two lines together will output:

<ns1:sample xmlns:ns1="http://www.example.com/ns1">

Subsequent calls to writeStartElement( ), writeEmptyElement( ), and writeAttribute( ) to create elements or attributes in this same namespace need not pass the namespace prefix again, only the namespace URI. When you omit the namespace prefix, XMLStreamWriter will use the prefix defined for the namespace URI in the namespace context.

As you write an XML document with XMLStreamWriter, the writer builds a namespace context representing the associations between namespace URIs and prefixes. This context is available as an instance of javax.xml.namespace.NamespaceContext, although you'll generally interact with the context using the namespace methods of XMLStreamWriter. Namespace URIs and prefixes can be added to the namespace context through the following XMLStreamWriter methods:


writeStartElement( prefix , localName , namespaceURI )

Write the START_ELEMENT event and, in addition, add the prefix and namespace URI to the namespace context.


writeEmptyElement( prefix , localName , namespaceURI )

Write the START_ELEMENT event for an empty element and, in addition, add the prefix and namespace URI to the namespace context.


writeAttribute( prefix , namespaceURI , localName , value )

Write the ATTRIBUTE event and, in addition, add the prefix and namespace URI to the namespace context.


writeNamespace( prefix , namespaceURI )

Write the NAMESPACE event and, in addition, add the prefix and namespace URI to the namespace context.


writeDefaultNamespace( namespaceURI )

Writes the default namespace attribute.


setDefaultNamespace( namespaceURI )

Sets the default namespace to the namespace URI passed to this method.


setPrefix( prefix , namespaceURI )

Adds a prefix and namespace URI to the namespace context.
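
As a minimal sketch of how the namespace context is used, the following code binds a prefix on the root element and then creates a child by passing only the namespace URI. The class name NamespaceContextDemo is invented for this illustration, and the output goes to a StringWriter so that it can be inspected.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical class name, used only for this sketch.
class NamespaceContextDemo {

    static final String NS1 = "http://www.example.com/ns1";

    static String build() throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        // Attach the prefix and URI to the element...
        writer.writeStartElement("ns1", "sample", NS1);
        // ...and write the xmlns:ns1 attribute that declares the prefix.
        writer.writeNamespace("ns1", NS1);
        // Only the URI is passed here; the prefix comes from the namespace context.
        writer.writeEmptyElement(NS1, "inner");
        writer.writeEndElement();
        writer.flush();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(build());
    }
}
```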

Because writeNamespace( ) adds a prefix and namespace URI to the namespace context, you should generally be able to use the methods without a prefix. In fact, it's generally a good idea to use the methods without a prefix, because you'll ensure the namespace attribute for that URI has already been written. For example, the code in the following example will throw a javax.xml.stream.XMLStreamException in the call to writeEmptyElement( ) because the namespace URI http://www.example.com/ns2 has not been bound to a prefix.

Example Invalid namespace creation

XMLStreamWriter writer = outputFactory.createXMLStreamWriter(System.out);
writer.writeStartElement("ns1", "sample", "http://www.example.com/ns1");
writer.writeNamespace("ns1", "http://www.example.com/ns1");
writer.writeEmptyElement("http://www.example.com/ns2", "inner");

If, however, the writeEmptyElement( ) call was:

writer.writeEmptyElement("ns2", "inner", "http://www.example.com/ns2");

no exception would be thrown, but our output would not have a namespace attribute defining the prefix for the namespace URI http://www.example.com/ns2. We'll see later in the chapter how the StAX writer has an optional mode that does not require namespace prefixes to be specified.
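
The difference between merely attaching a prefix and actually declaring it can be sketched as follows. The class name DeclaredPrefix and the boolean parameter are invented for this illustration. When declare is false, the element is written with the ns2 prefix but no xmlns:ns2 attribute; when it is true, an additional writeNamespace( ) call declares the prefix on the element.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical class name, used only for this sketch.
class DeclaredPrefix {

    static final String NS1 = "http://www.example.com/ns1";
    static final String NS2 = "http://www.example.com/ns2";

    static String write(boolean declare) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("ns1", "sample", NS1);
        writer.writeNamespace("ns1", NS1);
        // Passing the prefix explicitly avoids the exception...
        writer.writeEmptyElement("ns2", "inner", NS2);
        if (declare) {
            // ...but the xmlns:ns2 attribute appears only if you write it.
            writer.writeNamespace("ns2", NS2);
        }
        writer.writeEndElement();
        writer.flush();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(write(false));
        System.out.println(write(true));
    }
}
```

The first line printed uses the ns2 prefix without declaring it, so a namespace-aware parser would reject it; the second declares the prefix.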

XMLEventWriter

As you can see in the following figure, the XMLEventWriter interface is significantly shorter than XMLStreamWriter.

Figure: The XMLEventWriter interface

XMLEventWriter instances are created using the createXMLEventWriter( ) methods of XMLOutputFactory. To add events to the writer, you use one of the two add( ) methods, passing either a single XMLEvent object or an XMLEventReader object, in which case all of the remaining XMLEvent objects from the XMLEventReader are added to the writer. To add a single XMLEvent object, you must first create it using an implementation of the abstract class XMLEventFactory. Obtaining an implementation of XMLEventFactory is done the same way as for XMLOutputFactory and XMLInputFactory: by calling XMLEventFactory.newInstance( ). XMLEventFactory has a series of methods for creating various event objects:

public abstract class XMLEventFactory {
    // non-event creation methods omitted

    public abstract Attribute createAttribute(String prefix,
        String namespaceURI, String localName, String value);
    public abstract Attribute createAttribute(String localName, String value);
    public abstract Attribute createAttribute(QName name, String value);

    public abstract Namespace createNamespace(String namespaceURI);
    public abstract Namespace createNamespace(String prefix, String namespaceUri);

    public abstract StartElement createStartElement(QName name,
        Iterator attributes, Iterator namespaces);
    public abstract StartElement createStartElement(String prefix,
        String namespaceUri, String localName);
    public abstract StartElement createStartElement(String prefix,
        String namespaceUri, String localName, Iterator attributes,
        Iterator namespaces);
    public abstract StartElement createStartElement(String prefix,
        String namespaceUri, String localName, Iterator attributes,
        Iterator namespaces, NamespaceContext context);

    public abstract EndElement createEndElement(QName name, Iterator namespaces);
    public abstract EndElement createEndElement(String prefix,
        String namespaceUri, String localName);
    public abstract EndElement createEndElement(String prefix,
        String namespaceUri, String localName, Iterator namespaces);

    public abstract Characters createCharacters(String content);
    public abstract Characters createCData(String content);
    public abstract Characters createSpace(String content);
    public abstract Characters createIgnorableSpace(String content);

    public abstract StartDocument createStartDocument( );
    public abstract StartDocument createStartDocument(String encoding,
        String version, boolean standalone);
    public abstract StartDocument createStartDocument(String encoding,
        String version);
    public abstract StartDocument createStartDocument(String encoding);
    public abstract EndDocument createEndDocument( );

    public abstract EntityReference createEntityReference(String name,
        EntityDeclaration declaration);
    public abstract Comment createComment(String text);
    public abstract ProcessingInstruction createProcessingInstruction(
        String target, String data);
    public abstract DTD createDTD(String dtd);
}

As you can see, there are two ways to add attributes, including namespace declarations, to START_ELEMENT events: by passing two java.util.Iterators to the createStartElement( ) method or by creating Attribute and Namespace event objects and adding them to the XMLEventWriter. As a result, the following two examples write the same text to the output stream.

Example Adding Attribute and Namespace objects with an Iterator

Namespace ns1 = eventFactory.createNamespace("ns1", "http://www.example.com/ns1");
Namespace ns2 = eventFactory.createNamespace("ns2", "http://www.example.com/ns2");
List<Namespace> namespaceList = new ArrayList<Namespace>( );
namespaceList.add(ns1);
namespaceList.add(ns2);
Attribute attribute = eventFactory.createAttribute("ns2",
    "http://www.example.com/ns2", "attribute", "true");
List<Attribute> attributeList = Collections.singletonList(attribute);
writer.add(eventFactory.createStartElement(ns1.getPrefix( ),
    ns1.getNamespaceURI( ), "sample", attributeList.iterator( ),
    namespaceList.iterator( )));
writer.add(eventFactory.createEndElement("ns1", "http://www.example.com/ns1",
    "sample"));

Example Adding attributes and namespaces one at a time

writer.add(eventFactory.createStartElement("ns1",
    "http://www.example.com/ns1", "sample", null, null));
writer.add(eventFactory.createNamespace("ns1", "http://www.example.com/ns1"));
writer.add(eventFactory.createNamespace("ns2", "http://www.example.com/ns2"));
writer.add(eventFactory.createAttribute("ns2", "http://www.example.com/ns2",
    "attribute", "true"));
writer.add(eventFactory.createEndElement("ns1", "http://www.example.com/ns1",
    "sample"));

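The two fragments above omit the surrounding setup. A self-contained sketch of both approaches, writing to a StringWriter so that the results can be compared, might look like this. The class name EventWriterDemo and its method names are invented for this illustration; the three-argument createStartElement( ) is used in the second method instead of passing null iterators.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.Namespace;

// Hypothetical class name, used only for this sketch.
class EventWriterDemo {

    static final String NS1 = "http://www.example.com/ns1";
    static final String NS2 = "http://www.example.com/ns2";

    // Attributes and namespaces are supplied up front as Iterators.
    static String withIterators() throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory factory = XMLEventFactory.newInstance();
        List<Namespace> namespaces = Arrays.asList(
                factory.createNamespace("ns1", NS1),
                factory.createNamespace("ns2", NS2));
        List<Attribute> attributes = Collections.singletonList(
                factory.createAttribute("ns2", NS2, "attribute", "true"));
        writer.add(factory.createStartElement("ns1", NS1, "sample",
                attributes.iterator(), namespaces.iterator()));
        writer.add(factory.createEndElement("ns1", NS1, "sample"));
        writer.flush();
        return out.toString();
    }

    // Namespace and Attribute events are added after the StartElement.
    static String oneAtATime() throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory factory = XMLEventFactory.newInstance();
        writer.add(factory.createStartElement("ns1", NS1, "sample"));
        writer.add(factory.createNamespace("ns1", NS1));
        writer.add(factory.createNamespace("ns2", NS2));
        writer.add(factory.createAttribute("ns2", NS2, "attribute", "true"));
        writer.add(factory.createEndElement("ns1", NS1, "sample"));
        writer.flush();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(withIterators());
        System.out.println(oneAtATime());
    }
}
```
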
Event objects are reusable; if you wanted to create the same element twice, adding an attribute to the second element, you would only have to create the StartElement and EndElement objects once:

StartElement start = eventFactory.createStartElement(new QName("element"), null,
    null);
EndElement end = eventFactory.createEndElement(new QName("element"), null);
writer.add(start);
writer.add(end);
writer.add(start);
writer.add(eventFactory.createAttribute("attribute", "value"));
writer.add(end);

Namespace support by XMLEventWriter is similar to XMLStreamWriter, but since there's no way to create an event that does not have a defined prefix, XMLEventWriter does not have the same namespace definition checking as XMLStreamWriter. Otherwise, the semantics of namespace handling are the same as in XMLStreamWriter.
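
As noted earlier, add( ) also accepts an XMLEventReader, which makes copying a document from a reader to a writer a one-line operation. The following is a minimal sketch of that idea; the class name CopyEvents and its copy( ) method are invented for this illustration, and the exact form of the XML declaration in the output depends on the implementation.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;

// Hypothetical class name, used only for this sketch.
class CopyEvents {

    static String copy(String xml) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        writer.add(reader); // adds every remaining event from the reader
        writer.flush();
        reader.close();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(copy("<a><b>text</b></a>"));
    }
}
```

Because the events are consumed from the reader one at a time, the whole document is never held in memory, which is the same advantage described at the start of this section.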