Schema Basics

This section constructs, step by step, a simple schema document representing a typical address entry, introducing features of the XML Schema language as they are needed. Example 16-1 shows a very simple well-formed XML document.

Example 16-1. addressdoc.xml

    <?xml version="1.0"?>
    <fullName>Scott Means</fullName>

Assuming that the fullName element contains only a simple string value, a schema for this document could look like Example 16-2.

Example 16-2. address-schema.xsd

    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="fullName" type="xs:string"/>
    </xs:schema>

It is also common to associate the sample instance document explicitly with the schema document. Since the fullName element is not in any namespace, the xsi:noNamespaceSchemaLocation attribute is used, as shown in Example 16-3.

Example 16-3. addressdoc.xml with schema reference

    <?xml version="1.0"?>
    <fullName xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="address-schema.xsd">Scott Means</fullName>

Validating the simple document against its schema requires a validating XML parser that supports schemas, such as the open source Xerces parser from the Apache XML Project (http://xml.apache.org/xerces-j/). Xerces is written in Java and includes a command-line program called dom.DOMWriter, which can be used to validate addressdoc.xml like this:

    % java dom.DOMWriter -V -S addressdoc.xml

Since the document is valid, no validity errors are reported. An invalid document causes the parser to report errors. For instance, the following version adds a b element to the content of fullName, which the schema does not allow:

    <?xml version="1.0"?>
    <fullName xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="address-schema.xsd"
      >Scott <b>Means</b></fullName>

If this document were validated with dom.DOMWriter, Xerces would report the following validity errors:

    [Error] addressdoc.xml:4:13: Element type "b" must be declared.
    [Error] addressdoc.xml:4:31: Datatype error: In element 'fullName' : Can not have element children within a simple type content.

Document Organization

Now that there is a basic schema and a valid document from which to work, it is time to examine the structure of a schema document and its contents.
Every schema document consists of a single root xs:schema element.

TIP: The XML elements that make up an XML schema must belong to the XML Schema namespace (http://www.w3.org/2001/XMLSchema), which is frequently associated with the xs prefix.

Instance elements declared by top-level elements in the schema (immediate child elements of the xs:schema element) are global, and any globally declared element may appear as the root element of an instance document. In this case, since only one element has been declared, that shouldn't be a problem. But when building more complex schemas, this side effect must be taken into consideration. If more than one element is declared globally, a schema-valid document may not contain the root element you expect.

Naming conflicts are another potential problem with multiple global declarations. When writing schema declarations, it is an error to declare two things of the same type at the same scope. For instance, declaring two global elements with the same name is an error.

Annotations

Now that there is a working schema, it's good practice to include some documentary material about who authored it, what it is for, any copyleft restrictions, and so on. Since an XML schema document is an XML document in its own right, one simple option would be to use XML comments to include documentary information.

The major drawback to using XML comments is that parsers are not obliged to keep comments intact when parsing XML documents, and applications have to do a lot of work to interpret their internal structure. This increases the likelihood that, at some point, important documentation will be lost during an otherwise harmless transformation or editing procedure. Encoding documentation as markup, inline with the element and type declarations it refers to, also makes automatic documentation generation possible.

To accommodate this extra information, most schema elements may contain an optional xs:annotation element, which in turn may contain xs:documentation and xs:appinfo elements.

The xs:documentation element

As a concrete example, let's add some authorship and copyleft information to the simple schema document, as shown in Example 16-4.

Example 16-4.
address-schema.xsd with annotation

    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:annotation>
        <xs:documentation xml:lang="en-us">
          Simple schema example from Anonymous's
          <a href="http://wikipedia.org/w/index.php?search=xmlnut">XML tutorial.</a>
          copyleft 2002 Anonymous &amp; Associates
        </xs:documentation>
      </xs:annotation>
      <xs:element name="fullName" type="xs:string"/>
    </xs:schema>

The xs:documentation element accepts an xml:lang attribute that identifies the language of its content. Also, notice that the documentation element contains additional markup: an a element in the style of HTML.

The xs:appinfo element

In reality, there is little difference between the xs:documentation and xs:appinfo elements: both may contain arbitrary well-formed markup. By convention, xs:documentation holds human-readable content, while xs:appinfo holds information intended for applications.

For example, let's say that it is necessary to encode context-sensitive help text with each of the elements declared in a schema. This text might be used to generate tool-tips in a GUI or system prompts in a voicemail system. Either way, it would be very convenient to associate this information directly with the particular element in question using the xs:appinfo element:

    <xs:element name="fullName" type="xs:string">
      <xs:annotation>
        <xs:appinfo>
          <help-text>Enter the person's full name.</help-text>
        </xs:appinfo>
      </xs:annotation>
    </xs:element>
    . . .

Although schemas allow very sophisticated and powerful rules to be expressed, they cannot encompass every need that a schema developer might face. That is why it is important to remember that this facility exists for including your own application-specific information directly within the schema declarations.

TIP: Schematron is especially well-suited to use in annotations and is capable of checking a wide variety of conditions well beyond the bounds of XML Schema. For more information about Schematron, see http://www.ascc.net/xml/resource/schematron/schematron.html.
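As a rough illustration of how an application might consume the help text stored in xs:appinfo, the following Java sketch parses the schema as an ordinary XML document and looks up the help-text element for a named declaration. It uses the standard JAXP DOM API rather than the Xerces command-line tools discussed earlier. The class name AppinfoHelpText and the method helpTextFor are hypothetical names chosen for this example, and the schema is inlined as a string only to keep the sketch self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: reads application-specific help text out of xs:appinfo.
public class AppinfoHelpText {
    static final String XS = "http://www.w3.org/2001/XMLSchema";

    // The annotated declaration from the text, inlined for the example.
    static final String SCHEMA =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='fullName' type='xs:string'>"
      + "<xs:annotation><xs:appinfo>"
      + "<help-text>Enter the person's full name.</help-text>"
      + "</xs:appinfo></xs:annotation>"
      + "</xs:element>"
      + "</xs:schema>";

    // Returns the first help-text found under the named element declaration,
    // or null if there is no such declaration or no help text.
    static String helpTextFor(String schemaXml, String elementName) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match on the schema namespace
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(schemaXml)));

        NodeList decls = doc.getElementsByTagNameNS(XS, "element");
        for (int i = 0; i < decls.getLength(); i++) {
            Element decl = (Element) decls.item(i);
            if (!elementName.equals(decl.getAttribute("name"))) {
                continue;
            }
            // Searches all descendants, which is adequate for this flat schema.
            NodeList infos = decl.getElementsByTagNameNS(XS, "appinfo");
            for (int j = 0; j < infos.getLength(); j++) {
                NodeList help = ((Element) infos.item(j)).getElementsByTagName("help-text");
                if (help.getLength() > 0) {
                    return help.item(0).getTextContent();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(helpTextFor(SCHEMA, "fullName"));
    }
}
```

Because the help text lives in markup rather than in comments, it survives any namespace-aware parse, so a GUI or voice front end could call something like helpTextFor at startup instead of keeping a separate resource file.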
Element Declarations

XML documents are composed primarily of nested elements, and the xs:element element is what a schema uses to declare them. The simple example schema already contains one such declaration:

    <xs:element name="fullName" type="xs:string"/>

This declaration uses two attributes to describe the element that can appear in the instance document: name and type.

Simple Types

Schemas support two different types of content: simple and complex. Simple content equates with the basic data types found in most modern programming languages (strings, integers, dates, times, etc.). Simple types cannot, by definition, contain nested element content. In the previous example, the fullName element is declared with the built-in simple type xs:string.

Table 16-1. Built-in simple schema types

Since attribute values cannot contain elements, attributes must always be declared with simple types. Also, an element that is declared to have a simple type cannot have any attributes. This means that if an attribute must be added to the fullName element, its declaration has to change to use a complex type.

Attribute Declarations

To make the fullName element more informative, a language attribute can be added to it. Attributes are declared using the xs:attribute element. To incorporate a language attribute into the fullName declaration, the element needs a complex type with simple content that extends xs:string:

    <xs:element name="fullName">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:string">
            <xs:attribute name="language" type="xs:language"/>
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>

This declaration no longer has a type attribute. Instead, the type is defined inline by the xs:complexType child element.

Attribute groups

In DTDs, parameter entities are used to encapsulate repeated groups of attribute declarations that are shared between different element types. Schemas provide the same functionality in a more formal fashion using the xs:attributeGroup element.

An attribute group is simply a named group of xs:attribute declarations. Within the fullName declaration, the group is referenced by name using the ref attribute:

    <xs:element name="fullName">
      . . .
          <xs:extension base="xs:string">
            <xs:attributeGroup ref="nationality"/>
          </xs:extension>
      . . .
    </xs:element>

    <xs:attributeGroup name="nationality">
      <xs:attribute name="language" type="xs:language"/>
    </xs:attributeGroup>
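To see the attribute group in action, the following Java sketch compiles the fullName declaration with its nationality attribute group and validates a few instance documents against it. This is only one possible way to check the schema: it relies on the standard javax.xml.validation API instead of the dom.DOMWriter program used earlier, and the class name AttributeGroupCheck and method isValid are hypothetical. The elided parts of the declaration are filled in with the xs:complexType and xs:simpleContent wrappers from the previous section, which is an assumption about what the ellipses stand for.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: validates instance documents against the fullName schema.
public class AttributeGroupCheck {
    // fullName as a string-valued element whose attributes come from a named group.
    static final String SCHEMA =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='fullName'>"
      + "<xs:complexType><xs:simpleContent>"
      + "<xs:extension base='xs:string'>"
      + "<xs:attributeGroup ref='nationality'/>"
      + "</xs:extension>"
      + "</xs:simpleContent></xs:complexType>"
      + "</xs:element>"
      + "<xs:attributeGroup name='nationality'>"
      + "<xs:attribute name='language' type='xs:language'/>"
      + "</xs:attributeGroup>"
      + "</xs:schema>";

    // Returns true if the instance document is valid against SCHEMA.
    static boolean isValid(String instanceXml) throws Exception {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
        Validator validator = schema.newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(instanceXml)));
            return true;
        } catch (SAXException e) {
            // With no custom error handler, the first validity error is thrown.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Attribute supplied by the group, simple string content.
        System.out.println(isValid("<fullName language='en'>Scott Means</fullName>"));
        // Child elements are not allowed in simple content.
        System.out.println(isValid("<fullName language='en'>Scott <b>Means</b></fullName>"));
        // Attributes outside the group are not declared.
        System.out.println(isValid("<fullName country='US'>Scott Means</fullName>"));
    }
}
```

Adding another attribute to the nationality group would make it available on every type that references the group, without touching the individual element declarations.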