Configuring XMLReader Behavior

A configuration mechanism was one of the key features added in the SAX2 release. Parsers can support extensible sets of named Boolean feature flags and property objects. These function in similar ways, including using URIs to identify any number of features and properties. The exception model, presented in "Introducing SAX2" in "SAX2 Feature Flags" is used to distinguish the three basic types of feature or property: the current value may be read-only, read/write, or undefined. Some flags and properties may have rules about when they can be changed (typically not while parsing) or read.

Applications access property objects and feature flags through get*() and set*() methods and use URIs to identify the characteristic of interest. Since SAX does not provide a way to enumerate such URIs as supported by a parser, you will need to rely on parser documentation, or the tables in this section, to identify the legal identifiers. (Or consult the source code, if you have access to it.)

If you happen to be defining new handlers or features using the SAX2 framework, you don't have to ask for permission to define new property or feature flag IDs. Since they are identified using URIs, just start your ID with a base URI that you control. (Only the SAX maintainers would start with the http://xml.org/sax/ URI, for example.) Typically, it will be easiest to make up some HTTP URL based on a fully qualified domain name that you control. As with namespace URIs, these are used purely as identifiers rather than as locations from which data would be retrieved. (The "I" in URI stands for "identifier.")

XMLReader Properties

SAX2 defines two XMLReader calls for accessing named property objects. One of the most common uses for such objects is to install non-core event handlers. Accessing properties is like accessing feature flags, except that the values associated with these names are objects rather than Booleans:

XMLReader producer ...; String uri = ...; Object value = ...; // Try getting and setting the property try {
 System.out.println ("Initial property setting: " + producer.getProperty (uri); // if we get here, the property is supported producer.setProperty (uri, value); // if we get here, the parser set the property
}
catch (SAXNotSupportedException e) {
 // bad value for property ... maybe wrong type, or parser state System.out.println ("Can't set property: " + e.getMessage ()); System.exit (1);
}
catch (SAXNotRecognizedException e) {
 // property not supported by this parser System.out.println ("Doesn't understand property: " + e.getMessage ()); System.exit (1);
}

You'll notice the URIs for these standard properties happen to have a common prefix. This means that you can declare the prefix (http://xml.org/sax/properties/) as a constant string and construct the identifiers by string catenation.

Here are the standard properties:

Applications may find it useful to define their own types of handler interfaces, assembling sequences of SAX event "atoms" into higher-level event "molecules" that incorporate essential application-level semantics (and probably some procedural validation). This is the same kind of process model used by W3C's XML schema processing model: the Post-Schema-Validation Infoset (PSVI) additions incorporate semantics suited to processing with that kind of schema. Most applications need to associate even more semantics with data than are easily captured by such simple rules (including DTDs and all types of schema). Those semantics would likely not be understood by any common XMLReader, but other kinds of SAX processing components can help manage such application-level handlers. You can see an example of this technique in Example 6-3.

XMLReader Feature Flags

The previous chapter showed how to access feature flags from SAX parsers and used the standard validation flag as the primary example. Accessing feature flags follows the same model as accessing properties, except the values are boolean not Object. There are a handful of standard SAX2 feature flags, which are all you normally need. The namespace for features is different from the namespace for properties. You can't set a property to a java.lang.Boolean value and expect to have the same effect as setting the feature flag that happens to use the same identifier.

As with properties, the URIs for these standard feature flags happen to have a common prefix: http://xml.org/sax/features/. It's good developing practice to declare the prefix as a constant and construct these feature identifiers by string catenation, helping reduce errors. Also, remember that flags aren't necessarily either settable (read/write)[17] or readable (supported); some parsers won't recognize all these flags, and in some cases these flags expose parser behaviors that don't change.

[17]SAX could support write-only flags too, but these are rarely a good idea.

The standard flags are as follows:

A few additional standard extension features will likely be defined, providing even more complete Infoset support from SAX2 XML parsers.Ælfred also includes a nonvalidating parser, which supports only false for this flag.

Of the widely available parsers, only Xerces has nonstandard feature flags. (The Xerces distribution includes full documentation for those flags.) As a rule, avoid most of these, because they are parser-specific and even version-specific. Some are used to disable warnings about extra definitions that aren't errors. (Most parsers don't bother reporting such nonerrors; Xerces reports them by default.) Others promote noncompliant XML validation semantics. Here are a few flags that you may want to use.