Introduction to documents structured as XML data
In a data-structured document, the document semantics (the meaning of the content) can be defined separately from the document display (the presentation of the content). For example, you can use Microsoft Office Word to create an invoice where you define a number as a Total. You don't rely on formatting, such as boldface, or placement, such as the bottom row of a table, to identify the number as a Total. As a data-structured document, the invoice contains semantically identifiable components, including the Total, that can be extracted and processed independently of Office Word.
This fluidity of data in documents is possible because the data follows the rules of Extensible Markup Language (XML), an open, nonproprietary standard for structuring data.
Office Word provides capabilities for creating data-structured documents to accommodate three kinds of scenarios:
Automate document processes by using the default file format
You can create data-structured documents simply by saving documents in Office Word using the default file format, which is the Office Open XML Format. Because the content, formatting, and document properties are each stored separately in an open format, the documents stored in this way can easily be updated, customized, and even generated by automated processes outside of Office Word.
For example, if your company changes its location, you can update a large number of documents with the new address. An automated, server-based process can replace all the occurrences of the old address with the new one in a batch process without ever opening Office Word.
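The batch process described above could be sketched in Python roughly as follows. This is a minimal illustration, not a production tool. It assumes each document is an Office Open XML package (a ZIP archive) whose main body is stored as UTF-8 in word/document.xml. The function name replace_in_docx is hypothetical. Because it does a plain string replacement, it only finds text that Word stored in a single run, and it leaves headers, footers, and other parts untouched.

```python
import io
import zipfile
from xml.sax.saxutils import escape

MAIN_PART = "word/document.xml"  # main document body inside a .docx package


def replace_in_docx(data: bytes, old: str, new: str) -> bytes:
    """Return a copy of a .docx package with `old` replaced by `new`.

    Only the main document part is changed; every other part is copied as-is.
    """
    out_buf = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out_buf, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            payload = src.read(name)
            if name == MAIN_PART:
                # Assumption: the part is UTF-8 encoded XML.
                xml = payload.decode("utf-8")
                # Escape &, <, > so the search text matches its XML form.
                xml = xml.replace(escape(old), escape(new))
                payload = xml.encode("utf-8")
            dst.writestr(name, payload)
    return out_buf.getvalue()
```

A server-side job could read each file's bytes, call replace_in_docx with the old and new address, and write the result back, all without starting Office Word.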
Note For more information about the XML-based file format in Office Word, see Introduction to new file name extensions and Open XML Formats.
Incorporate custom business processes into documents
If your business processes rely on information that is stored in a data source such as a document library on a Microsoft Windows SharePoint Services 3.0 site, you can incorporate this information directly in your documents. When you map data fields (called content controls) to custom data that is stored as XML, your document becomes a front end for working with the data source.
For example, consider a document library that stores project plans by project name, and where each project is associated with a project manager. To automatically assign project plans to the appropriate project manager, the project plan template includes a content control where users select the project name from a list. When the project plan is saved in the library, it is automatically assigned to the appropriate project manager, based on the project name that the user specified in the document.
Furthermore, if the document library properties are updated so that different project managers are associated with the projects, the project plans stay up to date. They don't need to be updated individually in Office Word. Content controls provide two-way data maintenance: in the document, content controls can both update the data in the library and display up-to-date data from the library.
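To illustrate how an external process might read the data behind a mapped content control, here is a rough Python sketch. It assumes the mapped data is stored in a custom XML part named customXml/itemN.xml inside the document package. The function name read_custom_xml_value and the ProjectName element are hypothetical and used only for illustration.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

# Matches customXml/item1.xml, item2.xml, ... but not itemProps1.xml.
ITEM_PART = re.compile(r"customXml/item\d+\.xml$")


def read_custom_xml_value(data: bytes, local_name: str):
    """Return the text of the first element named `local_name` found in
    any custom XML part of the package, or None if there is no match."""
    with zipfile.ZipFile(io.BytesIO(data)) as pkg:
        for name in sorted(pkg.namelist()):
            if not ITEM_PART.match(name):
                continue
            root = ET.fromstring(pkg.read(name))
            for el in root.iter():
                # Compare on the local name, ignoring any XML namespace.
                if el.tag.rsplit("}", 1)[-1] == local_name:
                    return el.text
    return None
```

For example, read_custom_xml_value(package_bytes, "ProjectName") would return the project name a user selected, provided the template maps that content control to an element with that name.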
Note For instructions on how to add content controls to a document, see Create forms that users complete in Word.
Tag content with XML elements based on an XML Schema
If your organization has a custom XML Schema that defines the data structure that you use, you can attach the Schema to a document and then mark up the content of the document with the custom XML elements. Office Word can then validate the data according to the custom XML Schema that you attached to the document.
Custom data, rich formatting
For example, suppose that you provide IT services at a bank that prints a data sheet summarizing the bank's various account types, loans, and investment offerings. You could attach a custom XML Schema that defines InterestRate and other data as XML elements.
By saving these data sheets as Office Word documents with a custom schema attached, bank employees can format the sheets as printed matter to distribute to customers, while the data within the sheets can still be processed and updated by any program that reads XML. For example, an automated process could fetch the current interest rate each day, update the InterestRate XML element, and recalculate any figures that depend on it so that the data sheets are always up to date.
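The daily update described above might look something like the following Python sketch. It works on the custom XML as a standalone string and only covers the step that updates the element. The namespace http://example.com/bank, the Account element with its name attribute, and the function name update_interest_rates are all hypothetical; only the InterestRate element name comes from the example above.

```python
import xml.etree.ElementTree as ET

NS = "http://example.com/bank"  # hypothetical namespace of the custom schema
ET.register_namespace("", NS)   # keep it as the default namespace on output


def update_interest_rates(xml_text: str, current_rates: dict) -> str:
    """Set each account's InterestRate to the value in `current_rates`.

    `current_rates` maps an account name to its latest rate. Accounts that
    are not listed keep their existing rate.
    """
    root = ET.fromstring(xml_text)
    for account in root.iter(f"{{{NS}}}Account"):
        rate_el = account.find(f"{{{NS}}}InterestRate")
        name = account.get("name")
        if rate_el is not None and name in current_rates:
            rate_el.text = f"{current_rates[name]:.2f}"
    return ET.tostring(root, encoding="unicode")
```

In practice, current_rates would come from whatever system holds the bank's current rates, and the returned XML would be written back into the document.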
Note For more information about tagging a document with custom XML elements, see Create an XML document based on a custom Schema.