Document Type Definition

Before XML Schema became a W3C Recommendation in May 2001, XML documents were described using an alternative mechanism: the document type definition (DTD). DTDs predate the use of XML for RPCs and for representing complex business data. They were designed to describe documents intended primarily for human consumption (as opposed to machine consumption) and are therefore inadequate for describing the complex data structures used in Web services.

The main deficiencies of DTDs in comparison to XML Schema are:

Start Sidebar
Validating Parsers

All examples in this appendix were verified using the Apache Xerces validating parser, which comes with Sun's Java Web Services Developer Pack 1.0 (Java WSDP). Microsoft MSXML is another popular parser. discusses using Java WSDP for parsing documents.

End Sidebar

A brief introduction to DTD is in order. Screenshot A.2 shows what the DTD for Listing A.1 looks like.

Screenshot A.2: Simple DTD for the employeeList document

The DTD for the employeeList document starts by identifying itself: the first entry, !DOCTYPE employeeList, declares that this is the document type definition for documents whose root element is employeeList. Subsequent entries inform the parser that the employeeList element may contain zero or more employee elements and that each employee element must contain the employee_id, name, extn, and dept elements, in that order. The !ATTLIST statement enforces the rule that the employee element must have a type attribute with only two allowable values, perm or contract. The name element is described in the same way as the employee element, in that it consists of two other elements: first_name and last_name. The statement !ELEMENT email (#PCDATA) signifies that the email element can contain any parsed character data, that is, a string value. As explained earlier, DTDs cannot describe many of the datatypes used in programming languages.
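The declarations described above can be sketched as an internal DTD subset and exercised with a validating parser. The following is a reconstruction from the prose, not the original figure: the position and optionality of email inside employee, the class and method names, and the sample employee values are all assumptions made for illustration. It uses the standard JAXP API, which on a typical JDK is backed by a Xerces-derived parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdValidationSketch {

    // DTD reconstructed from the description in the text.
    // Assumption: email is the last child of employee and is optional.
    static final String DTD =
          "<!DOCTYPE employeeList [\n"
        + "  <!ELEMENT employeeList (employee*)>\n"
        + "  <!ELEMENT employee (employee_id, name, extn, dept, email?)>\n"
        + "  <!ATTLIST employee type (perm | contract) #REQUIRED>\n"
        + "  <!ELEMENT employee_id (#PCDATA)>\n"
        + "  <!ELEMENT name (first_name, last_name)>\n"
        + "  <!ELEMENT first_name (#PCDATA)>\n"
        + "  <!ELEMENT last_name (#PCDATA)>\n"
        + "  <!ELEMENT extn (#PCDATA)>\n"
        + "  <!ELEMENT dept (#PCDATA)>\n"
        + "  <!ELEMENT email (#PCDATA)>\n"
        + "]>\n";

    // Builds a one-employee document; all values are hypothetical sample data.
    static String sampleDocument(String type, String extn) {
        return "<employeeList><employee type=\"" + type + "\">"
             + "<employee_id>42</employee_id>"
             + "<name><first_name>John</first_name><last_name>Doe</last_name></name>"
             + "<extn>" + extn + "</extn>"
             + "<dept>123-456-7890</dept>"
             + "</employee></employeeList>";
    }

    // Parses the document against the DTD and returns the validity errors found.
    static List<String> validate(String body) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // enable DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();
        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
            @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
            @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(DTD + body)));
        return errors;
    }

    // Treats documents that are not well formed as invalid.
    static boolean isValid(String body) {
        try {
            return validate(body).isEmpty();
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Conforms to the DTD.
        System.out.println(isValid(sampleDocument("perm", "12345")));
        // Violates the ATTLIST enumeration (type must be perm or contract).
        System.out.println(isValid(sampleDocument("temp", "12345")));
        // Still accepted: #PCDATA places no constraint on the text of extn.
        System.out.println(isValid(sampleDocument("perm", "not-a-number")));
    }
}
```

The third call illustrates the datatype limitation: the parser enforces element order and the attribute enumeration, but any text is acceptable inside a #PCDATA element.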

While this DTD expresses some of Flute Bank's business rules, it is inadequate to represent more complex ones. For example, it cannot express that a department number must be in the format "XXX-XXX-XXXX", that all valid telephone extensions in Flute Bank are five digits, or that employee IDs range from 1 to 100,000. The limitations of DTDs in describing database and object-oriented programming datatypes and constraints necessitated a new, XML-based description specification: the XML Schema definition language.
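As a rough illustration of the difference, the three rules above can be written as XML Schema facets and checked with the standard javax.xml.validation API. This is a minimal sketch, not the schema developed in this appendix: it covers only a simplified employee element with three children, it assumes each X in the department format stands for a decimal digit, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SchemaConstraintSketch {

    // Simplified schema: only the three constrained fields are modeled.
    // Assumption: "XXX-XXX-XXXX" means digits in a 3-3-4 grouping.
    static final String XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">\n"
        + " <xs:element name=\"employee\"><xs:complexType><xs:sequence>\n"
        + "  <xs:element name=\"employee_id\"><xs:simpleType>\n"
        + "   <xs:restriction base=\"xs:int\">\n"
        + "    <xs:minInclusive value=\"1\"/><xs:maxInclusive value=\"100000\"/>\n"
        + "   </xs:restriction></xs:simpleType></xs:element>\n"
        + "  <xs:element name=\"extn\"><xs:simpleType>\n"
        + "   <xs:restriction base=\"xs:string\">\n"
        + "    <xs:pattern value=\"[0-9]{5}\"/>\n"
        + "   </xs:restriction></xs:simpleType></xs:element>\n"
        + "  <xs:element name=\"dept\"><xs:simpleType>\n"
        + "   <xs:restriction base=\"xs:string\">\n"
        + "    <xs:pattern value=\"[0-9]{3}-[0-9]{3}-[0-9]{4}\"/>\n"
        + "   </xs:restriction></xs:simpleType></xs:element>\n"
        + " </xs:sequence></xs:complexType></xs:element>\n"
        + "</xs:schema>";

    // Compiles the schema; a failure here means the sketch itself is broken.
    static Schema compileSchema() {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
    }

    // Returns true when an employee with the given field values satisfies the schema.
    static boolean isValid(String employeeId, String extn, String dept) {
        String xml = "<employee>"
                   + "<employee_id>" + employeeId + "</employee_id>"
                   + "<extn>" + extn + "</extn>"
                   + "<dept>" + dept + "</dept>"
                   + "</employee>";
        try {
            // The default error handling throws on the first validity error.
            compileSchema().newValidator()
                           .validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("42", "12345", "123-456-7890"));     // satisfies all three rules
        System.out.println(isValid("100001", "12345", "123-456-7890")); // ID out of range
        System.out.println(isValid("42", "1234", "123-456-7890"));      // extension too short
        System.out.println(isValid("42", "12345", "1234567890"));       // department format wrong
    }
}
```

Each rule maps to a restriction on a built-in type: a numeric range for the ID and regular-expression patterns for the extension and department number. None of these can be stated in a DTD, where all three elements would simply be #PCDATA.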