You may have heard of Extensible Markup Language (XML), and you may have heard many reasons why your organization should use it. But what is XML, exactly? This article explains the basics of XML - what it is and how it works.

A brief look at mark up, markup, and tags

To understand XML, it helps to understand the idea of marking up data. People have created documents for centuries, and for just as long they have marked up those documents. For example, school teachers mark up student papers all of the time. They tell students to move paragraphs, clarify sentences, correct misspellings, and so on. Marking up a document is how we define the structure, meaning, and visual appearance of the information in the document. If you have ever used the Track Changes feature in Microsoft Office Word, you have used a computerized form of mark up.

In computing, "mark up" has also evolved into "markup." Markup is the process of using codes called tags (or sometimes tokens) to define the structure, the visual appearance, and - in the case of XML - the meaning of any data.

The HTML code for this article is a good example of computer markup at work. If you browse through it (in Microsoft Internet Explorer, right-click the page, and then click View Source), you will see a mix of readable text and Hypertext Markup Language (HTML) tags, such as <p> and <h2>. Tags in HTML and XML documents are easy to recognize because they are surrounded by angle brackets. In the source code for this article, the HTML tags do a variety of jobs, such as defining the beginning and end of each paragraph (<p> ... </p>) and marking the location of each image.

So what makes it XML?

HTML and XML documents both contain data that is surrounded with tags, but that is where the similarities between the two languages end. In HTML, the tags define the look and feel of your data - the headlines go here, the paragraph starts there, and so on. In XML, the tags define the structure and meaning of your data - what the data is.

When you describe the structure and meaning of your data, you make it possible to reuse that data in any number of ways. For example, if you have a block of sales data and each item in the block is clearly identified, you can load just the items that you need into a sales report and load other items into an accounting database. Put another way, you can use one system to generate your data and mark it up with XML tags, and then process that data in any number of other systems, regardless of the hardware platform or operating system. That portability is why XML has become one of the most popular technologies for exchanging data.

Remember these facts as you proceed:

You can see that XML tags make it possible to know exactly what kind of data that you are looking at. For example, you know this is data about a cat, and you can easily find the cat's name, age, and so on. The ability to create tags that define almost any data structure is what makes XML "extensible."

But don't confuse the tags in that code sample with tags in an HTML file. For instance, if you paste that XML structure into an HTML file and view the file in your browser, the results will look something like this:

Izzy Siamese 6 yes no Izz138bod Colin Wilcox

The browser ignores your XML tags and displays just the data.
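As a rough illustration of how a program, unlike a browser, can use the tags, the Python sketch below embeds a reconstruction of the cat record and reads it with the standard library. The record itself is an assumption: the element names are taken from the schema shown later in this article (lowercase, because XML is case-sensitive), and the values are taken from the browser output above.

```python
import xml.etree.ElementTree as ET

# Reconstructed cat record (an assumption, see the note above).
CAT_XML = """<?xml version="1.0"?>
<cat>
  <name>Izzy</name>
  <breed>Siamese</breed>
  <age>6</age>
  <altered>yes</altered>
  <declawed>no</declawed>
  <license>Izz138bod</license>
  <owner>Colin Wilcox</owner>
</cat>"""


def read_record(xml_text):
    """Return the child elements of the root as a tag -> text dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


def text_only(xml_text):
    """Drop the tags and keep only the data, roughly what the browser shows."""
    root = ET.fromstring(xml_text)
    return " ".join("".join(root.itertext()).split())


record = read_record(CAT_XML)
print(record["name"], record["age"])   # fields found by tag name
print(text_only(CAT_XML))              # the same data with the tags ignored
```

Because each value is labeled by its tag, the program can pick out the name or the age directly instead of guessing from position in a line of text.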

A word about well-formed data

You may hear someone from your IT department mention "well-formed" XML. A well-formed XML file conforms to a set of very strict rules that govern XML. If a file doesn't conform to those rules, programs that read XML will refuse to process it. For example, in the previous code sample, every opening tag has a closing tag, so the sample adheres to one of the rules for being well-formed. If you remove a tag and try to open that file in one of the Office programs, you will see an error message, and the program will stop you from using the file.

You don't necessarily need to know the rules for creating well-formed XML (though they are easy to understand), but you do need to remember that you can share XML data among programs and systems only if that data is well-formed. If you can't open an XML file, chances are that file isn't well-formed.
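The following Python sketch shows what "not well-formed" means in practice. It uses the standard library parser, which raises an error as soon as a rule is broken. The two short records are hypothetical examples, not files from this article.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Every opening tag has a matching closing tag.
GOOD = "<cat><name>Izzy</name><age>6</age></cat>"

# The closing </name> tag has been removed.
BAD = "<cat><name>Izzy<age>6</age></cat>"

print(is_well_formed(GOOD))
print(is_well_formed(BAD))
```

The parser does not try to repair the second record. It stops, which is the same behavior you see when an Office program refuses to open a damaged XML file.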

XML is also platform-independent, meaning that any program built to use XML can read and process your XML data, regardless of the hardware or operating system. For example, with the right XML tags, you can use a desktop program to open and work with data from a mainframe computer. And, regardless of who creates a body of XML data, you can work with the same data in several of the Microsoft Office 2003 and Microsoft Office Professional programs, including Microsoft Office Access, Microsoft Office Word, Microsoft Office InfoPath, and Microsoft Office Excel. Because it is so portable, XML has become one of the most popular technologies for exchanging data between databases and user desktops.

[Illustration: XML in use by other programs]

In addition to tagged, well-formed data, XML systems typically use two additional components: schemas and transforms. The following sections explain how these additional components work.

A quick look at schemas

Don't let the term "schema" intimidate you. A schema is just an XML file that contains the rules for what can and cannot reside in an XML data file. Schema files typically use the .xsd file name extension, while XML data files use the .xml extension.

Schemas allow programs to validate data. They provide the framework for structuring data and ensuring that it makes sense to the creator and any other users. For example, if a user enters invalid data, such as text in a date field, the program can prompt the user to enter the correct data. As long as the data in an XML file conforms to the rules in a given schema, any program that supports XML can use that schema to read, interpret, and process the data. For example, as shown in the following illustration, Excel and Word can validate the <CAT> data against the CAT schema.

[Illustration: Schemas enable applications to share XML data]

Schemas can become complex, and teaching you how to create one is beyond the scope of this article. (Besides, you probably have an IT department that knows how.) However, it helps to know what schemas look like. The following schema defines the rules for the <CAT> ... </CAT> tag set.

 <?xml version="1.0"?>
 <!-- Rules for a single cat record -->
 <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <xsd:element name="cat">
     <xsd:complexType>
       <!-- The child elements must appear in this order -->
       <xsd:sequence>
         <xsd:element name="name" type="xsd:string"/>
         <xsd:element name="breed" type="xsd:string"/>
         <xsd:element name="age" type="xsd:positiveInteger"/>
         <xsd:element name="altered" type="xsd:boolean"/>
         <xsd:element name="declawed" type="xsd:boolean"/>
         <xsd:element name="license" type="xsd:string"/>
         <xsd:element name="owner" type="xsd:string"/>
       </xsd:sequence>
     </xsd:complexType>
   </xsd:element>
 </xsd:schema>
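To make the rules in the schema above more concrete, here is a minimal Python sketch that applies the same kinds of checks by hand: the element order, a positive whole number for the age, and a true or false value for the two boolean fields. It is not a real schema validator (the Python standard library does not include one), the type checks are simplified, and the two sample records are hypothetical. Note that the xsd:boolean type accepts only true, false, 1, and 0, so a value such as "yes" would not pass.

```python
import re
import xml.etree.ElementTree as ET


def is_string(text):
    return True  # any text is acceptable for a string field


def is_positive_integer(text):
    # Simplified check: digits only (optional leading +), value of 1 or more.
    return re.fullmatch(r"\+?[0-9]+", text) is not None and int(text) > 0


def is_boolean(text):
    return text in {"true", "false", "1", "0"}


# Element names and checks, in the order the schema's sequence lists them.
RULES = [
    ("name", is_string),
    ("breed", is_string),
    ("age", is_positive_integer),
    ("altered", is_boolean),
    ("declawed", is_boolean),
    ("license", is_string),
    ("owner", is_string),
]


def check_cat(xml_text):
    """Return a list of problems; an empty list means the record passed."""
    root = ET.fromstring(xml_text)
    if root.tag != "cat":
        return [f"root element is <{root.tag}>, expected <cat>"]
    children = list(root)
    found = [child.tag for child in children]
    expected = [name for name, _ in RULES]
    if found != expected:
        return [f"expected elements {expected}, found {found}"]
    problems = []
    for child, (name, is_valid) in zip(children, RULES):
        text = (child.text or "").strip()
        if not is_valid(text):
            problems.append(f"<{name}> has invalid value {text!r}")
    return problems


# Hypothetical records used only to exercise the checks.
VALID = ("<cat><name>Izzy</name><breed>Siamese</breed><age>6</age>"
         "<altered>true</altered><declawed>false</declawed>"
         "<license>Izz138bod</license><owner>Colin Wilcox</owner></cat>")
INVALID = VALID.replace("<age>6</age>", "<age>six</age>").replace(
    "<altered>true</altered>", "<altered>yes</altered>")

print(check_cat(VALID))
print(check_cat(INVALID))
```

A program that supports schemas performs this kind of checking for you, driven by the schema file rather than by hand-written code.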

Don't worry about understanding everything in the sample. Just keep these facts in mind:

A quick look at transforms

As we mentioned earlier, XML also provides powerful ways to use or reuse data. The mechanism for reusing data is called an Extensible Stylesheet Language Transformation (XSLT), or simply, a transform. Transforms are where XML can really get interesting. For example, after you validate a data file against a schema, you can apply a transform that makes the data work as a marketing brochure in Microsoft Office Word 2003 and apply another transform to create a sales report in Office Excel.

You (okay, your IT department) can also use transforms to exchange data between back-end systems, such as databases. For instance, say that Database A stores the sales data in a table structure that works well for the sales department. Database B stores the revenue and expense data in a table structure that is tailored for the accounting department. Database B can use a transform to accept data from A and write that data to the correct tables.

The combination of data file, schema, and transform constitutes a basic XML system. The following illustration shows how such systems typically work. The data file is validated against the schema and then rendered in any number of usable ways by a transform. In this case, the transform deploys the data to a table in a Web page.

[Illustration: A basic XML file structure with a schema and transform]

The following code sample shows one way to write a transform. It loads the <CAT> data into a table on a Web page. Again, the point of the sample isn't to show you how to write a transform, but to show you one form that a transform can take.

 <?xml version="1.0"?>
 <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <!-- Build one table for the whole document -->
   <xsl:template match="/">
     <TABLE>
       <!-- Header row -->
       <TR>
         <TH>Name</TH>
         <TH>Breed</TH>
         <TH>Age</TH>
         <TH>Altered</TH>
         <TH>Declawed</TH>
         <TH>License</TH>
         <TH>Owner</TH>
       </TR>
       <!-- One data row for each cat element -->
       <xsl:for-each select="cat">
         <TR align="left" valign="top">
           <TD> <xsl:value-of select="name"/> </TD>
           <TD> <xsl:value-of select="breed"/> </TD>
           <TD> <xsl:value-of select="age"/> </TD>
           <TD> <xsl:value-of select="altered"/> </TD>
           <TD> <xsl:value-of select="declawed"/> </TD>
           <TD> <xsl:value-of select="license"/> </TD>
           <TD> <xsl:value-of select="owner"/> </TD>
         </TR>
       </xsl:for-each>
     </TABLE>
   </xsl:template>
 </xsl:stylesheet>

This sample shows how one type of transform might look when it is coded, but remember that you can just describe what you need from the data in plain English. For example, you can go to your IT department and say that you need to print the sales data for particular regions for the past two years, "and I need it to look this way." Your IT department can then write (or change) a transform to do that job.
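If you are curious about what the transform above does step by step, the following Python sketch performs roughly the same job: it writes a header row, then one table row per cat record, copying each value into its own cell. This is a plain Python equivalent for illustration, not an XSLT processor (the standard library does not include one), and the sample record and the optional wrapper element are assumptions.

```python
import xml.etree.ElementTree as ET
from html import escape

# Columns in the same order as the header row of the transform.
COLUMNS = ["name", "breed", "age", "altered", "declawed", "license", "owner"]


def cats_to_html_table(xml_text):
    """Render cat records as an HTML table, one row per record."""
    root = ET.fromstring(xml_text)
    # Accept a single <cat> root, or a hypothetical wrapper holding several.
    cats = [root] if root.tag == "cat" else root.findall("cat")

    header = "".join(f"<TH>{column.capitalize()}</TH>" for column in COLUMNS)
    lines = ["<TABLE>", f"  <TR>{header}</TR>"]
    for cat in cats:
        cells = "".join(
            f"<TD>{escape(cat.findtext(column, default=''))}</TD>"
            for column in COLUMNS
        )
        lines.append(f'  <TR align="left" valign="top">{cells}</TR>')
    lines.append("</TABLE>")
    return "\n".join(lines)


# Hypothetical record used only to exercise the function.
SAMPLE = ("<cat><name>Izzy</name><breed>Siamese</breed><age>6</age>"
          "<altered>yes</altered><declawed>no</declawed>"
          "<license>Izz138bod</license><owner>Colin Wilcox</owner></cat>")

table = cats_to_html_table(SAMPLE)
print(table)
```

The data file stays the same in every case. Only the code that presents it changes, which is why one set of XML data can feed a Web page, a report, or a database.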

What makes all of this even more convenient is that Microsoft and a growing number of other vendors are creating transforms for jobs of all sorts. In the future, chances are that you will be able to download a transform that either meets your needs or that you can adjust to suit your purpose. That means XML will cost less to use over time.

A peek at XML in the Microsoft Office System

The professional editions of Microsoft Office 2003 and Office release provide extensive XML support.

So far so good, but what if you have XML data with no schema? The Office programs that support XML have their own approaches to helping you work with the data. For instance, if you open an XML file in Word without an attached schema, Word displays the tags and data and enables you to apply a transform if, for example, the file's creator or your IT department provides one. At the least, you can read the tags and data in the file.

In contrast, Excel infers a schema if you open an XML file that doesn't already have one. Excel then gives you the option of loading this data into a read-only file or of mapping the data into either an XML list (in Microsoft Office Excel 2003) or an XML table (in Office Excel). You can use the XML lists and tables to sort, filter, or add calculations to the data.

Office Professional and Microsoft Office 2003 provide the same sets of XML tools. In Office Professional, you must first enable XML support, and then you start the tools from different locations. However, after you start the tools, they work the same in Microsoft Office 2003 and Office Professional. The following steps explain how to start the XML tools for Office Excel and Office Word.

Note: Microsoft Office Access enables its XML tools by default, so you can skip the first steps if you use Access.

Enable the XML tools in Office Excel and Office Word

  1. In Excel or Word, click the Microsoft Office Button, and then click Excel Options or Word Options, depending on the program that you have open.
  2. Click Personalize.
  3. Under Top options for working with application name, select Show Developer tab in the Ribbon, and then click OK.

Start the XML tools in Office Excel and Office Word

Start the XML tools in Office Access

  1. Click the External Data tab.
  2. Do one of the following:
    • In the Import group, click XML File.
    • In the Export group, click More, and then click XML File.
