RELAX NG
RELAX NG is, in many senses, the rebel child in the constraint family. While DTDs and XML Schema are both W3C specifications (or at least part of a specification, in the case of DTDs), RELAX NG is not endorsed or "blessed" by the W3C. And, even though it has been developed underneath the OASIS umbrella (http://www.oasis-open.org/home/index.php), RELAX NG is still seen as almost a grassroots effort to compete with, or at least provide an alternative to, XML Schema. Whatever you think about the political standing of RELAX NG, though, any good XML programmer should have RELAX NG in her constraint toolkit.
Constraining XML with RELAX NG
RELAX NG, like XML Schema, is pure XML. You start out by nesting everything within a grammar element:
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <!-- Content model for XML -->
</grammar>
This sets up the namespace for all the elements you use, which are of course all part of the RELAX NG syntax. datatypeLibrary tells the schema where to pull data types from (covered in the "Data types" section later) when you type elements and attributes. You don't have to put this on your root element, but you'll find that's the best place to locate the reference; otherwise, you end up burying it somewhere in the middle of your schema, and that's a maintenance pain.
You'll find that most of the RELAX NG constructs are pretty intuitive; I'll run through the highlights.
Elements
You define elements using the element keyword, and nestings within an XML document are represented by nestings within the RELAX NG schema:
<element >
  <element >
    <element >
      <text/>
    </element>
    <element >
      <text/>
    </element>
    <!-- etc... -->
  </element>
</element>
In fact, you should already be seeing one of the cooler features of RELAX NG: its structure closely mirrors the structure of the document it's constraining.
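To make the mirroring concrete, here is a minimal sketch in Python (used purely for illustration) that walks an instance document and emits a RELAX NG element pattern with the same nesting. The person, firstName, and lastName names are hypothetical, and the sketch assumes a document without namespaces, attributes, or repeated children; it is not how any particular tool generates schemas.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"


def mirror(instance_elem):
    """Build a RELAX NG <element> pattern that mirrors one instance element."""
    pattern = ET.Element(f"{{{RNG_NS}}}element", {"name": instance_elem.tag})
    children = list(instance_elem)
    if not children:
        # Leaf elements are treated as plain text content.
        ET.SubElement(pattern, f"{{{RNG_NS}}}text")
    for child in children:
        # Each nested instance element becomes a nested <element> pattern.
        pattern.append(mirror(child))
    return pattern


# Hypothetical instance document, used only for illustration.
instance = ET.fromstring(
    "<person><firstName>Ada</firstName><lastName>Lovelace</lastName></person>"
)

# Serialize RELAX NG elements in the default namespace (no prefix).
ET.register_namespace("", RNG_NS)
schema = mirror(instance)
if hasattr(ET, "indent"):  # pretty-printing helper, Python 3.9+
    ET.indent(schema)
print(ET.tostring(schema, encoding="unicode"))
```

The generated pattern has one element per instance element, nested exactly as the instance is nested, which is the property the schema language is designed around.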
Cardinality and recurrence
In DTDs, you used *, +, and ? to indicate how many times an element can occur; XML Schema uses minOccurs and maxOccurs. In RELAX NG, you use elements, like zeroOrMore, oneOrMore, or optional:
<zeroOrMore>
  <element >
    <oneOrMore>
      <element >
        <element >
          <text/>
        </element>
        <optional>
          <element name="middleName">
            <text/>
          </element>
        </optional>
        <element >
          <text/>
        </element>
        <!-- etc... -->
      </element>
    </oneOrMore>
  </element>
</zeroOrMore>
This is a little different from anything you've seen so far, but turns out to be pretty simple to remember. Any element (or attribute) without a cardinality modifier like this is assumed to appear once and only once (just like in XML Schema).
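If you are used to DTD notation, the correspondence can be written down as a small lookup table. The following Python sketch is only an illustration of that mapping; the wrap helper and the middleName pattern string are assumptions made for the example, not part of any RELAX NG tool.

```python
# Mapping from DTD occurrence indicators to RELAX NG wrapper elements.
DTD_TO_RNG = {"*": "zeroOrMore", "+": "oneOrMore", "?": "optional"}


def wrap(pattern, indicator=""):
    """Wrap a RELAX NG pattern string according to a DTD occurrence indicator."""
    if indicator == "":
        # No modifier: the pattern must appear exactly once.
        return pattern
    wrapper = DTD_TO_RNG[indicator]
    return f"<{wrapper}>{pattern}</{wrapper}>"


# Hypothetical pattern for an optional middle name.
middle = '<element name="middleName"><text/></element>'
print(wrap(middle, "?"))
```

Passing no indicator leaves the pattern unwrapped, which matches the default of "once and only once."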
Attributes
Attributes are equally easy to specify:
<zeroOrMore>
  <element >
    <oneOrMore>
      <element >
        <element >
          <text/>
        </element>
        <optional>
          <element name="middleName">
            <text/>
          </element>
        </optional>
        <element >
          <text/>
        </element>
        <zeroOrMore>
          <element >
            <attribute >
              <choice>
                <value>home</value>
                <value>work</value>
                <!-- and so on -->
              </choice>
            </attribute>
          </element>
        </zeroOrMore>
        <!-- etc... -->
      </element>
    </oneOrMore>
  </element>
</zeroOrMore>
I also tossed in the choice operator, which allows you to indicate specific values that are allowed for the attribute.
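As a rough illustration of what the choice expresses, this Python sketch reads the value children out of an attribute definition and checks a candidate against them. The attribute name "type" and the allowed_values helper are assumptions for the example; a real validator does considerably more (whitespace normalization, for one).

```python
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"

# Hypothetical attribute definition; the name "type" is an assumption.
ATTRIBUTE = """
<attribute xmlns="http://relaxng.org/ns/structure/1.0" name="type">
  <choice>
    <value>home</value>
    <value>work</value>
  </choice>
</attribute>
"""


def allowed_values(attribute_xml):
    """Return the literal values listed under an attribute's <choice>."""
    attribute = ET.fromstring(attribute_xml)
    return [v.text for v in attribute.findall(f"{RNG}choice/{RNG}value")]


values = allowed_values(ATTRIBUTE)
print(values)              # the enumerated values from the schema
print("mobile" in values)  # a value outside the choice is not allowed
```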
Data types
Last but not least (in this RELAX NG crash course, at least), you can type your data in RELAX NG, using the data element. Here's the definition of a point type, from the RELAX NG tutorial, for example:
<element name="point" datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <element name="x">
    <data type="double"/>
  </element>
  <element name="y">
    <data type="double"/>
  </element>
</element>
When you use the string data type (instead of the <text/> tag) for an element, you have to specify the length allowed as well; for this reason, only use the string data type when you have a maximum length in mind:
<element name="email">
  <data type="string">
    <param name="maxLength">127</param>
  </data>
</element>
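To show what a length parameter constrains, here is a small Python sketch that reads the limit out of a data pattern and tests candidate values against it. It assumes the parameter is the XML Schema maxLength facet and that length is counted in characters; the max_length and fits helpers are hypothetical names for this example, not part of any validator.

```python
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"

# A data pattern with a length limit; maxLength is the XML Schema facet name.
DATA = """
<data xmlns="http://relaxng.org/ns/structure/1.0" type="string">
  <param name="maxLength">127</param>
</data>
"""


def max_length(data_xml):
    """Return the maxLength parameter of a <data> pattern, or None if absent."""
    data = ET.fromstring(data_xml)
    for param in data.findall(f"{RNG}param"):
        if param.get("name") == "maxLength":
            return int(param.text)
    return None


def fits(value, data_xml):
    """Check a candidate string against the pattern's length limit only."""
    limit = max_length(data_xml)
    return limit is None or len(value) <= limit


print(max_length(DATA))
print(fits("a" * 127, DATA), fits("a" * 128, DATA))
```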
That's not much on RELAX NG, but it's plenty to help you get started. As mentioned earlier, RELAX NG by Eric van der Vlist (O'Reilly) is available for a more in-depth look at the schema language.
Generating RELAX NG from an XML Instance
Relaxer (used previously in the "Generating DTDs from XML Instance Documents" and "Generating XML Schemas from Instance Documents" sections) handles RELAX NG schema generation easily enough, using the -rng option:
relaxer -rng toc.xml
The resultant RELAX NG schema, toc.rng, is shown in Example 2-5.
Example 2-5. Relaxer generates RELAX NG schemas, in XML format, of course
<?xml version="1.0" encoding="UTF-8" ?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
         xmlns:java="http://www.relaxer.org/xmlns/relaxer/java"
         xmlns:relaxer="http://www.relaxer.org/xmlns/relaxer"
         xmlns:sql="http://www.relaxer.org/xmlns/relaxer/sql"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
         ns="">
  <start>
    <ref />
  </start>
  <define >
    <element >
      <attribute >
        <data type="token"/>
      </attribute>
      <oneOrMore>
        <ref />
      </oneOrMore>
    </element>
  </define>
  <define >
    <element >
      <optional>
        <attribute >
          <data type="token"/>
        </attribute>
      </optional>
      <attribute >
        <data type="token"/>
      </attribute>
      <ref />
    </element>
  </define>
  <define >
    <element >
      <attribute >
        <data type="token"/>
      </attribute>
    </element>
  </define>
</grammar>
Converting DTDs to RELAX NG Schemas
To convert DTDs to RELAX NG, use Sun's RELAX NG Converter, which you can download from https://msv.dev.java.net. The RELAX NG Converter began as its own project, but is now part of Sun's Multi-Schema XML Validator project. Move the downloaded and extracted folder somewhere useful; I moved mine into /usr/local/java, and then renamed the folder (rngconv-20030225) to a more manageable name (rngconv). Then just run java and supply the converter JAR file with the -jar option:
java -jar /usr/local/java/rngconv/rngconv.jar toc.dtd > toc-dtd.rng
The result, redirected into the toc-dtd.rng file, is shown in Example 2-6.
Example 2-6. I'm not a fan of all the whitespace that RELAX NG Converter introduces, but other than that, it does a great job
<?xml version="1.0"?>
<grammar ns=""
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <choice>
      <ref />
      <element >
        <attribute >
          <data type="normalizedString"/>
        </attribute>
        <oneOrMore>
          <ref />
        </oneOrMore>
      </element>
      <ref />
    </choice>
  </start>
  <define >
    <element >
      <attribute >
        <data type="normalizedString"/>
      </attribute>
    </element>
  </define>
  <define >
    <element >
      <optional>
        <attribute >
          <data type="normalizedString"/>
        </attribute>
      </optional>
      <attribute >
        <data type="normalizedString"/>
      </attribute>
      <optional>
        <ref />
      </optional>
    </element>
  </define>
</grammar>
Converting XML Schemas to RELAX NG Schemas
You can use the Sun RELAX NG Converter (see the previous section "Converting DTDs to RELAX NG Schemas" for installation instructions) for converting XML Schema to RELAX NG:
java -jar /usr/local/java/rngconv/rngconv.jar toc.xsd > toc.rng
Example 2-7 shows what the tool did with my XSD file as input.
Example 2-7. Although the semantics differ slightly between the schema converted from a DTD and this one converted from an XML Schema, the constraints are remarkably similar
<?xml version="1.0"?>
<grammar ns=""
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <choice>
      <notAllowed/>
      <element >
        <optional>
          <attribute >
            <data type="token"/>
          </attribute>
        </optional>
        <oneOrMore>
          <element >
            <optional>
              <attribute >
                <data type="token"/>
              </attribute>
            </optional>
            <optional>
              <attribute >
                <data type="token"/>
              </attribute>
            </optional>
            <element >
              <optional>
                <attribute >
                  <data type="token"/>
                </attribute>
              </optional>
            </element>
          </element>
        </oneOrMore>
      </element>
    </choice>
  </start>
</grammar>
Validating XML Against a RELAX NG Schema
No surprise here; xmllint does the job once again (xmllint was introduced in "Validating XML Against a DTD" and used again in "Validating XML Against an XML Schema"). You just need to use the --relaxng option, and you're off to the races:
xmllint --relaxng toc-dtd.rng toc.xml --noout
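If you want to run the same check from a script, one option is to shell out to xmllint and look at its exit status, which is zero when the document validates. The Python sketch below is a minimal illustration under that assumption; the xmllint_command and validate helpers are hypothetical names, and the file names simply reuse the ones from the command above.

```python
import shutil
import subprocess


def xmllint_command(schema_path, document_path):
    """Build the xmllint argument list for RELAX NG validation."""
    return ["xmllint", "--noout", "--relaxng", schema_path, document_path]


def validate(schema_path, document_path):
    """Return True if xmllint reports the document as valid."""
    if shutil.which("xmllint") is None:
        raise RuntimeError("xmllint not found on PATH")
    result = subprocess.run(
        xmllint_command(schema_path, document_path),
        capture_output=True,
        text=True,
    )
    # xmllint exits with 0 on success and a nonzero code on any failure.
    return result.returncode == 0


# Show the command that validate() would run for the chapter's files.
print(" ".join(xmllint_command("toc-dtd.rng", "toc.xml")))
```

Calling validate("toc-dtd.rng", "toc.xml") runs the command; any diagnostics xmllint writes are available on result.stderr if you extend the helper to return them.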
I realize that I've assaulted you with tools in this chapter, but you just can't have enough conversion utilities in your back pocket. You never know when it will be easier to quickly convert a DTD to a RELAX NG schema, and work with that, rather than trying to reengineer a DTD; the same is true for validation, and working with XML Schema. And, this will all be even more useful as we begin to explore introducing Java into the XML equation.