RELAX NG is, in many senses, the rebel child in the constraint family. While DTDs and XML Schema are both W3C specifications (or at least part of a specification, in the case of DTDs), RELAX NG is not endorsed or "blessed" by the W3C. And, even though it has been developed underneath the OASIS umbrella, RELAX NG is still seen as almost a grassroots effort to compete with, or at least provide an alternative to, XML Schema. Whatever you think about the political standing of RELAX NG, though, any good XML programmer should have RELAX NG in her constraint toolkit.

Constraining XML with RELAX NG

RELAX NG, like XML Schema, is pure XML. You start out by nesting everything within a grammar element:

<grammar xmlns="http://relaxng.org/ns/structure/1.0"
 datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
 <!-- Content model for XML -->
</grammar>

This sets up the namespace for all the elements you use, which are of course all part of the RELAX NG syntax. The datatypeLibrary attribute tells the schema where to pull data types from when you type elements and attributes (covered in the "Data types" section later). You don't have to put this attribute on your root element, but that's the best place for it; otherwise, you end up burying it somewhere in the middle of your schema, and that's a maintenance pain.
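
As a rough sketch of how the pieces fit together, the smallest useful grammar might look something like the following. The note element name is invented for illustration; the start element, which also shows up in the generated schemas later in this chapter, marks the pattern that the root of the document must match.

```xml
<!-- Hypothetical example: the element name "note" is illustrative only -->
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <!-- start names the pattern that the document's root element must match -->
  <start>
    <element name="note">
      <text/>
    </element>
  </start>
</grammar>
```

Because datatypeLibrary sits on the root here, any data element nested further down would pick it up without repeating the URI.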

Java Tip As with XML Schema, you should always use the same URI for the namespace here (http://relaxng.org/ns/structure/1.0).

You'll find that most of the RELAX NG constructs are pretty intuitive; I'll run through the highlights.

Elements

You define elements using the element keyword, and nestings within an XML document are represented by nestings within the RELAX NG schema:

<element >
  <element >
    <element >
      <text/>
    </element>
    <element >
      <text/>
    </element>
    <!-- etc... -->
  </element>
</element>

In fact, you should already be seeing one of the cooler features of RELAX NG: its structure closely mirrors the structure of the document it's constraining.
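
To make that mirroring concrete, here's a small sketch of a pattern with a matching instance shown in a comment. The person, firstName, and lastName names are my own assumptions for illustration, not names taken from the listing above.

```xml
<!-- Pattern sketch; all element names here are hypothetical -->
<element name="person" xmlns="http://relaxng.org/ns/structure/1.0">
  <element name="firstName">
    <text/>
  </element>
  <element name="lastName">
    <text/>
  </element>
</element>
<!--
  An instance this pattern accepts (note the identical nesting):
  <person>
    <firstName>Ada</firstName>
    <lastName>Lovelace</lastName>
  </person>
-->
```

Read the pattern and the instance side by side, and the only real difference is that the schema says element name="..." where the document has the tag itself.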

Cardinality and recurrence

In DTDs, you used *, +, and ? to indicate how many times an element can occur; XML Schema uses minOccurs and maxOccurs. In RELAX NG, you use elements, like zeroOrMore, oneOrMore, or optional:

<zeroOrMore>
  <element >
    <oneOrMore>
      <element >
        <element >
          <text/>
        </element>
        <optional>
          <element name="middleName">
            <text/>
          </element>
        </optional>
        <element >
          <text/>
        </element>
        <!-- etc... -->
      </element>
    </oneOrMore>
  </element>
</zeroOrMore>

This is a little different from anything you've seen so far, but turns out to be pretty simple to remember. Any element (or attribute) without a cardinality modifier like this is assumed to appear once and only once (just like in XML Schema).
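
If you're coming from DTDs, it may help to see the three modifiers lined up against the operators they replace. This is only a sketch, and the order, item, note, and tag names are hypothetical.

```xml
<!-- Roughly the DTD content model: order (item+, note?, tag*) -->
<!-- All element names are hypothetical, chosen for illustration -->
<element name="order" xmlns="http://relaxng.org/ns/structure/1.0">
  <oneOrMore>                    <!-- DTD: item+ -->
    <element name="item">
      <text/>
    </element>
  </oneOrMore>
  <optional>                     <!-- DTD: note? -->
    <element name="note">
      <text/>
    </element>
  </optional>
  <zeroOrMore>                   <!-- DTD: tag* -->
    <element name="tag">
      <text/>
    </element>
  </zeroOrMore>
</element>
```

An order with one item and nothing else satisfies this pattern; an order with no item at all does not.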

Attributes

Attributes are equally easy to specify:

<zeroOrMore>
  <element >
    <oneOrMore>
      <element >
        <element >
          <text/>
        </element>
        <optional>
          <element name="middleName">
            <text/>
          </element>
        </optional>
        <element >
          <text/>
        </element>
        <zeroOrMore>
          <element >
            <attribute >
              <choice>
                <value>home</value>
                <value>work</value>
                <!-- and so on -->
              </choice>
            </attribute>
          </element>
        </zeroOrMore>
        <!-- etc... -->
      </element>
    </oneOrMore>
  </element>
</zeroOrMore>

I also tossed in the choice operator, which allows you to indicate specific values that are allowed for the attribute.
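
Here's a smaller sketch that isolates just the attribute handling, with a required enumerated attribute next to an optional free-form one. The phone, type, and extension names are assumptions made up for this example.

```xml
<!-- Hypothetical names throughout: phone, type, extension -->
<element name="phone" xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- With no wrapper, the attribute must appear exactly once -->
  <attribute name="type">
    <choice>
      <value>home</value>
      <value>work</value>
    </choice>
  </attribute>
  <!-- Wrapping an attribute in optional lets documents leave it out -->
  <optional>
    <attribute name="extension">
      <text/>
    </attribute>
  </optional>
  <text/>
</element>
<!--
  Accepted:  <phone type="home">555-0100</phone>
  Rejected:  <phone type="mobile">555-0100</phone>
-->
```

The second instance fails because mobile is not one of the values listed inside choice.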

Data types

Last but not least (in this RELAX NG crash course, at least), you can type your data in RELAX NG using the data element. For example, here's the definition of a point type from the RELAX NG tutorial:

<element datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <element >
    <data type="double"/>
  </element>
  <element >
    <data type="double"/>
  </element>
</element>

Java Tip If you placed the datatypeLibrary attribute on your root element, then you don't need to repeat that declaration here.

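When you use the string data type (instead of the <text/> tag) for an element, you can also restrict the length allowed by nesting param elements inside the data element. That restriction is the main reason to reach for the string data type; if you don't have something like a maximum length in mind, <text/> is simpler: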

<element >
  <data type="string">
    <param >127</param>
  </data>
</element>
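
As a further sketch, param elements can carry other facets from the XML Schema datatype library in the same way. The product, sku, and quantity element names below are hypothetical; minLength, maxLength, and minInclusive are standard XML Schema facet names.

```xml
<!-- Hypothetical element names; the facet names come from XML Schema datatypes -->
<element name="product" xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <element name="sku">
    <!-- A non-empty string of at most 127 characters -->
    <data type="string">
      <param name="minLength">1</param>
      <param name="maxLength">127</param>
    </data>
  </element>
  <element name="quantity">
    <!-- An integer that is zero or greater -->
    <data type="integer">
      <param name="minInclusive">0</param>
    </data>
  </element>
</element>
```

Each param narrows the set of values the data type accepts, so stacking several of them, as with sku here, means a value has to satisfy all of them.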

That's not much on RELAX NG, but it's plenty to help you get started. As mentioned earlier, RELAX NG by Eric van der Vlist (O'Reilly) is available for a more in-depth look at the schema language.

Generating RELAX NG from an XML Instance

Relaxer (used in earlier sections of this chapter) handles RELAX NG schema generation easily enough, using the -rng option:

relaxer -rng toc.xml

The resultant RELAX NG schema, toc.rng, is shown in the following example.

Example Relaxer generates RELAX NG schemas, in XML format, of course

<?xml version="1.0" encoding="UTF-8" ?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
 xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
 xmlns:java="http://www.relaxer.org/xmlns/relaxer/java"
 xmlns:relaxer="http://www.relaxer.org/xmlns/relaxer"
 xmlns:sql="http://www.relaxer.org/xmlns/relaxer/sql"
 datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
 ns="">
 <start>
 <ref />
 </start>
 <define >
 <element >
 <attribute >
 <data type="token"/>
 </attribute>
 <oneOrMore>
 <ref />
 </oneOrMore>
 </element>
 </define>
 <define >
 <element >
 <optional>
 <attribute >
 <data type="token"/>
 </attribute>
 </optional>
 <attribute >
 <data type="token"/>
 </attribute>
 <ref />
 </element>
 </define>
 <define >
 <element >
 <attribute >
 <data type="token"/>
 </attribute>
 </element>
 </define>
</grammar>
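
The generated schema leans on named patterns: define gives a pattern a name, ref pulls that pattern in by name, and start says which pattern the document root must match. Here's a minimal sketch of that wiring; the section and title names are invented for illustration and are not taken from toc.xml.

```xml
<!-- Hypothetical names: "section" and "title" are for illustration only -->
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <!-- The document root must match the pattern named "section" -->
    <ref name="section"/>
  </start>
  <define name="section">
    <element name="section">
      <attribute name="title"/>
      <zeroOrMore>
        <!-- A pattern may refer to itself, which allows nested sections -->
        <ref name="section"/>
      </zeroOrMore>
    </element>
  </define>
</grammar>
```

Splitting a schema up this way is what lets a tool like Relaxer describe each element once and then reuse it wherever it appears.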

Converting DTDs to RELAX NG Schemas

To convert DTDs to RELAX NG, use Sun's RELAX NG Converter, which you'll need to download first. The RELAX NG Converter began as its own project, but is now part of Sun's Multi-Schema XML Validator project. Move the downloaded and extracted folder somewhere useful; I moved mine into /usr/local/java, and then renamed the folder (rngconv-20030225) to a more manageable name (rngconv). Then just run java with the -jar option, supplying the converter JAR file:

java -jar /usr/local/java/rngconv/rngconv.jar toc.dtd > toc-dtd.rng

Java Warning Every bit of documentation I found insisted you supply RELAX NG Converter the -dtd flag for converting DTDs, but I couldn't get the tool to work with that flag. It was only when I removed the flag that I got successful results.

The result, redirected into the toc-dtd.rng file, is shown in the following example.

Example I'm not a fan of all the whitespace that RELAX NG Converter introduces, but other than that, it does a great job

<?xml version="1.0"?>
<grammar ns="" xmlns="http://relaxng.org/ns/structure/1.0" datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
 <start>
 <choice>
 <ref />
 <element >
 <attribute >
 <data type="normalizedString"/>
 </attribute>
 <oneOrMore>
 <ref />
 </oneOrMore>
 </element>
 <ref />
 </choice>
 </start>
 <define >
 <element >
 <attribute >
 <data type="normalizedString"/>
 </attribute>
 </element>
 </define>
 <define >
 <element >
 <optional>
 <attribute >
 <data type="normalizedString"/>
 </attribute>
 </optional>
 <attribute >
 <data type="normalizedString"/>
 </attribute>
 <optional>
 <ref />
 </optional>
 </element>
 </define>
</grammar>

Converting XML Schemas to RELAX NG Schemas

You can use the Sun RELAX NG Converter (see the previous section, "Converting DTDs to RELAX NG Schemas," for installation instructions) for converting XML Schema to RELAX NG:

java -jar /usr/local/java/rngconv/rngconv.jar toc.xsd > toc.rng

The following example shows what the tool did with my XSD file as input.

Example Although the semantics differ slightly between the schema converted from a DTD and this one converted from an XML Schema, the constraints are remarkably similar

<?xml version="1.0"?>
<grammar ns="" xmlns="http://relaxng.org/ns/structure/1.0" datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
 <start>
 <choice>
 <notAllowed/>
 <element >
 <optional>
 <attribute >
 <data type="token"/>
 </attribute>
 </optional>
 <oneOrMore>
 <element >
 <optional>
 <attribute >
 <data type="token"/>
 </attribute>
 </optional>
 <optional>
 <attribute >
 <data type="token"/>
 </attribute>
 </optional>
 <element >
 <optional>
 <attribute >
 <data type="token"/>
 </attribute>
 </optional>
 </element>
 </element>
 </oneOrMore>
 </element>
 </choice>
 </start>
</grammar>

Validating XML Against a RELAX NG Schema

No surprise here; xmllint does the job once again (xmllint was introduced and used in earlier sections). You just need to use the --relaxng option, and you're off to the races:

xmllint --relaxng toc-dtd.rng toc.xml --noout

Java Tip You may have noticed that I supplied no instructions for referencing a RELAX NG schema in your XML document. That's because there's no need to; the schema to use is controlled by the tool validating, rather than the input document.

I realize that I've assaulted you with tools in this chapter, but you just can't have enough conversion utilities in your back pocket. You never know when it will be easier to quickly convert a DTD to a RELAX NG schema, and work with that, rather than trying to reengineer a DTD; the same is true for validation, and working with XML Schema. And, this will all be even more useful as we begin to explore introducing Java into the XML equation.