StAX Basics
StAX is an acronym for Streaming API for XML. It is Java Specification Request (JSR) 173, sponsored by BEA with the goal of standardizing the various pull parser implementations that had been created in the absence of a Java or W3C standard. StAX provides interfaces for parsing XML documents as well as producing them.

The JSR should actually be titled "Streaming APIs for XML" because StAX encompasses two distinct APIs. The specification refers to these as the cursor API and the event iterator API. According to the specification, the objective of the cursor API is "[t]o allow users to read and write XML as efficiently as possible," whereas for the event iterator API, it's "to be easy to use, event based, easy to extend, and allow easy pipelining." This implies a greater difference between the APIs than actually exists, as we'll see throughout this chapter.

The specific interfaces for the cursor API are XMLStreamReader and XMLStreamWriter. For the event iterator API, these interfaces are XMLEventReader and XMLEventWriter. All of these interfaces are in the package javax.xml.stream.

In the cursor API interfaces, methods on the reader or writer object itself allow the developer to obtain information about the current position in the document or add new content to it. This is referred to as the cursor API because it is similar to how database cursors work. In the event iterator API, you obtain event objects from the reader or add event objects to the writer. Each strongly typed event object contains only the methods appropriate for that type of event. In most implementations, the XMLEventReader implementation uses XMLStreamReader under the hood and, likewise, XMLEventWriter uses XMLStreamWriter.

The final release of the StAX specification, API, and JavaDocs can be downloaded from http://jcp.org/en/jsr/detail?id=173.
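
The difference between the two APIs is easiest to see side by side. What follows is a minimal sketch rather than anything taken from the specification: the class name StaxApiComparison, its method names, and the sample document are made up for illustration. Both methods collect the element names in a document, one by querying the reader directly and one by inspecting event objects.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, used only for this illustration.
class StaxApiComparison {

    // Cursor API: the reader itself answers questions about the current position.
    static List<String> elementNamesWithCursor(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                // next() advances the cursor and returns the int event type.
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    names.add(reader.getLocalName());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return names;
    }

    // Event iterator API: each call hands back a typed event object.
    static List<String> elementNamesWithEvents(String xml) {
        List<String> names = new ArrayList<>();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    // Only a StartElement exposes getName().
                    names.add(event.asStartElement().getName().getLocalPart());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<book><title>Java and XML</title></book>";
        System.out.println(elementNamesWithCursor(xml));
        System.out.println(elementNamesWithEvents(xml));
    }
}
```

Both methods return the same list for the same input, which fits the observation that the two APIs differ less than the specification's stated objectives suggest.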
StAX Event Types
Whether you use the cursor or event iterator interfaces, StAX defines the same set of events that can occur while traversing the document. As part of the API, each of these is assigned an int constant. These constants are defined in javax.xml.stream.XMLStreamConstants and are listed in Table 8-1.
Table 8-1. StAX event types
Event type ID | Event type name |
---|---|
1 | START_ELEMENT |
2 | END_ELEMENT |
3 | PROCESSING_INSTRUCTION |
4 | CHARACTERS |
5 | COMMENT |
6 | SPACE |
7 | START_DOCUMENT |
8 | END_DOCUMENT |
9 | ENTITY_REFERENCE |
10 | ATTRIBUTE |
11 | DTD |
12 | CDATA |
13 | NAMESPACE |
14 | NOTATION_DECLARATION |
15 | ENTITY_DECLARATION |
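
Because the event types are plain int values, it can help to translate them back into names while debugging. The following sketch assumes nothing beyond the constants in XMLStreamConstants; the class EventTypeNames and its methods are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, used only for this illustration.
class EventTypeNames {

    // Maps an event type ID to its constant name in XMLStreamConstants.
    static String nameOf(int eventType) {
        switch (eventType) {
            case XMLStreamConstants.START_ELEMENT:          return "START_ELEMENT";
            case XMLStreamConstants.END_ELEMENT:            return "END_ELEMENT";
            case XMLStreamConstants.PROCESSING_INSTRUCTION: return "PROCESSING_INSTRUCTION";
            case XMLStreamConstants.CHARACTERS:             return "CHARACTERS";
            case XMLStreamConstants.COMMENT:                return "COMMENT";
            case XMLStreamConstants.SPACE:                  return "SPACE";
            case XMLStreamConstants.START_DOCUMENT:         return "START_DOCUMENT";
            case XMLStreamConstants.END_DOCUMENT:           return "END_DOCUMENT";
            case XMLStreamConstants.ENTITY_REFERENCE:       return "ENTITY_REFERENCE";
            case XMLStreamConstants.ATTRIBUTE:              return "ATTRIBUTE";
            case XMLStreamConstants.DTD:                    return "DTD";
            case XMLStreamConstants.CDATA:                  return "CDATA";
            case XMLStreamConstants.NAMESPACE:              return "NAMESPACE";
            case XMLStreamConstants.NOTATION_DECLARATION:   return "NOTATION_DECLARATION";
            case XMLStreamConstants.ENTITY_DECLARATION:     return "ENTITY_DECLARATION";
            default:                                        return "UNKNOWN(" + eventType + ")";
        }
    }

    // Lists the names of the events a cursor reader reports for a document.
    static List<String> eventTrace(String xml) {
        List<String> trace = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            // A new reader is already positioned on START_DOCUMENT.
            trace.add(nameOf(reader.getEventType()));
            while (reader.hasNext()) {
                trace.add(nameOf(reader.next()));
            }
            reader.close();
        } catch (XMLStreamException e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(eventTrace("<greeting>hello</greeting>"));
    }
}
```

Which events appear for a given document, and how text is split into CHARACTERS events, can vary with the implementation and with factory properties, so treat the trace as a diagnostic aid rather than a fixed contract.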
After reading the first few chapters of this tutorial, the meaning of these events should be obvious. Table 8-2 contains the correlation between the events and the SAX methods we have already discussed.
Table 8-2. Correlation between StAX events and SAX handler methods
StAX event type name | SAX handler name | SAX handler method |
---|---|---|
START_ELEMENT | ContentHandler | startElement( ) |
END_ELEMENT | ContentHandler | endElement( ) |
PROCESSING_INSTRUCTION | ContentHandler | processingInstruction( ) |
CHARACTERS | ContentHandler | characters( ) |
COMMENT | LexicalHandler | comment( ) |
SPACE | ContentHandler | ignorableWhitespace( ) |
START_DOCUMENT | ContentHandler | startDocument( ) |
END_DOCUMENT | ContentHandler | endDocument( ) |
ENTITY_REFERENCE | ContentHandler | skippedEntity( ) |
ATTRIBUTE | n/a | n/a |
DTD | LexicalHandler | startDTD( )/endDTD( ) |
CDATA | n/a | n/a |
NAMESPACE | ContentHandler | startPrefixMapping( ) |
NOTATION_DECLARATION | DTDHandler | notationDecl( ) |
ENTITY_DECLARATION | DTDHandler | unparsedEntityDecl( ) |
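
To make the first two rows of Table 8-2 concrete, here is a rough sketch that records element boundaries twice: once through SAX callbacks and once by pulling StAX events. The class SaxStaxCorrelation and its method names are invented for this example, and it assumes a document without namespace prefixes, so the SAX qualified name and the StAX local name are the same string.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, used only for this illustration.
class SaxStaxCorrelation {

    // SAX: the parser pushes startElement()/endElement() callbacks at the handler.
    static List<String> traceWithSax(String xml) {
        final List<String> trace = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                trace.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                trace.add("end:" + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped only to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return trace;
    }

    // StAX: the application pulls START_ELEMENT/END_ELEMENT events from the reader.
    static List<String> traceWithStax(String xml) {
        List<String> trace = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int type = reader.next();
                if (type == XMLStreamConstants.START_ELEMENT) {
                    trace.add("start:" + reader.getLocalName());
                } else if (type == XMLStreamConstants.END_ELEMENT) {
                    trace.add("end:" + reader.getLocalName());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return trace;
    }

    public static void main(String[] args) {
        String xml = "<a><b/></a>";
        System.out.println(traceWithSax(xml));
        System.out.println(traceWithStax(xml));
    }
}
```

The two traces match, but the control flow is inverted: with SAX the parser drives the handler, while with StAX the loop in the application decides when to ask for the next event.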
Obtaining a StAX Implementation
StAX is simply an API specification, not an implementation. You can write and compile code using only the JAR included with the specification. However, to run the compiled code, you'll need a StAX implementation. As of Java SE 6, a StAX implementation is included. For prior versions of the JRE, you'll need to download an implementation. The reference implementation for StAX is available at http://stax.codehaus.org. From this web site, you can download two JAR files. One contains the API interfaces and classes; it is the same JAR available via the JSR web site. The second contains the actual reference implementation. As of the time of this writing, the latest API JAR was stax-api-1.0.1.jar and the latest implementation JAR was stax-1.2.0_rc2-dev.jar. For production apps, I strongly recommend the Sun Java Streaming XML Parser (SJSXP) instead of the reference implementation. SJSXP is available either from https://sjsxp.dev.java.net or as part of the Java Web Services Developer Pack (JWSDP), downloadable from http://java.oracle.com/webservices/jwsdp. We'll talk more about JWSDP in .
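
One quick way to see which implementation will be used at runtime is to ask the standard factories and print the concrete class names. This is only a sketch; the class name StaxImplementationCheck and its methods are hypothetical, and the output depends entirely on your JRE and classpath, so none is shown here.

```java
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;

// Hypothetical helper class, used only for this illustration.
class StaxImplementationCheck {

    // Concrete class behind XMLInputFactory; it identifies the parser implementation.
    static String inputFactoryClass() {
        // newInstance() throws FactoryConfigurationError if no implementation is found.
        return XMLInputFactory.newInstance().getClass().getName();
    }

    // Concrete class behind XMLOutputFactory; it identifies the writer implementation.
    static String outputFactoryClass() {
        return XMLOutputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("Input factory:  " + inputFactoryClass());
        System.out.println("Output factory: " + outputFactoryClass());
    }
}
```

If only the API JAR is on the classpath of a JRE older than Java SE 6, the factory lookup fails instead of returning a class name, which is the runtime symptom of a missing implementation.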