Schema Evolution

Looking beyond HTML generation, a key use for XSLT is transforming one form of XML into another. In many cases, these are not radical transformations but minor enhancements, such as adding new attributes, changing the order of elements, or removing unused data. If you have only a handful of XML files to transform, it is much easier to edit the XML directly than to go through the trouble of writing a stylesheet. But when a large collection of XML documents exists, a single XSLT stylesheet can transform the entire library in a single pass. For B2B applications, schema evolution is useful when different customers require the same data in different formats.

An Example XML File

Let's suppose that you wrote a logging API for your Java programs. Log files are written in XML and are formatted as shown in Example 3-10.

Example 3-10. Log file before transformation

    <?xml version="1.0" encoding="UTF-8"?>
    <log>
      <message text="input parameter was null">
        <type>ERROR</type>
        <when>
          <year>2000</year>
          <month>01</month>
          <day>15</day>
          <hour>03</hour>
          <minute>12</minute>
          <second>18</second>
        </when>
        <where>
          <class>com.foobar.util.StringUtil</class>
          <method>reverse(String)</method>
        </where>
      </message>
      <message text="cannot read config file">
        <type>WARNING</type>
        <when>
          <year>2000</year>
          <month>01</month>
          <day>15</day>
          <hour>06</hour>
          <minute>35</minute>
          <second>44</second>
        </when>
        <where>
          <class>com.foobar.servlet.MainServlet</class>
          <method>init( )</method>
        </where>
      </message>
      <!-- more messages ... -->
    </log>

As you can see from this example, the file format is quite verbose. Of particular concern is how the date and time are written. Since log files can be quite large, it would be a good idea to select a more concise format for this information. Additionally, the text is stored as an attribute on the <message> element, which is awkward for long messages. Moving the text into a child element, and the type into an attribute, gives a format like this:

    <message type="WARNING">
      <text>This is the text of a message.
      Multi-line messages are easier when an element is used instead of an attribute.</text>
      ...remainder omitted

The Identity Transformation

When writing a schema evolution stylesheet, it is a good idea to start with an identity transformation. This is a very simple template that takes the original XML document and "transforms" it into a new document with the same elements and attributes as the original. Example 3-11 shows a stylesheet that contains an identity transformation template.

Example 3-11. identityTransformation.xslt

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

      <!-- the identity transformation -->
      <xsl:template match="@*|node( )">
        <xsl:copy>
          <xsl:apply-templates select="@*|node( )"/>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>

Amazingly, it takes only a single template to perform the identity transformation, regardless of the complexity of the XML data. Our stylesheet encodes the result using UTF-8 and indents lines, regardless of the original XML format. In XPath, node( ) matches the child nodes of the current context but not its attributes, so it is combined with @* using the union operator:

    <xsl:template match="@*|node( )">

Translated into English, this means that the template will match any attribute or any child node of the current context. Since node( ) covers elements, text, comments, and processing instructions, the template matches all of these in addition to attributes. Inside of our template, we use <xsl:copy> to copy the current node to the result tree, and then continue processing its attributes and children with:

    <xsl:apply-templates select="@*|node( )"/>

Transforming Elements and Attributes

Once you have typed in the identity transformation and tested it, it is time to begin adding templates that actually perform the schema evolution. In XSLT, it is possible for two or more templates to match the same node in the XML data. In these cases, the more specific template is instantiated.
Without going into a great deal of technical detail, an explicit match such as match="when" takes precedence over the more general @*|node( ) pattern.

In the log file example, a key problem is the quantity of XML data written for each <when> element. A single element with attributes is far more concise:

    <timestamp time="06:35:44" day="15" month="01" year="2000"/>

The following template will perform the necessary transformation:

    <xsl:template match="when">
      <!-- change 'when' into 'timestamp', and change its
           child elements into attributes -->
      <timestamp time="{hour}:{minute}:{second}"
          year="{year}" month="{month}" day="{day}"/>
    </xsl:template>

This template can be added to the identity transformation stylesheet and will take precedence whenever a <when> element is encountered.

The next thing to tackle is the <message> element. Its <type> child becomes an attribute, and its text attribute becomes a child element:

    <!-- locate <message> elements -->
    <xsl:template match="message">
      <!-- copy the current node, but not its attributes -->
      <xsl:copy>
        <!-- change the <type> element to an attribute -->
        <xsl:attribute name="type">
          <xsl:value-of select="type"/>
        </xsl:attribute>

        <!-- change the text attribute to a child node -->
        <xsl:element name="text">
          <xsl:value-of select="@text"/>
        </xsl:element>

        <!-- since the select attribute is not present,
             xsl:apply-templates processes all child nodes of
             the current node (but not its attributes) -->
        <xsl:apply-templates/>
      </xsl:copy>
    </xsl:template>

This almost completes the stylesheet. Because <xsl:apply-templates/> still processes the <type> child, the identity template would copy it into the result, so an empty template is needed to suppress it:

    <xsl:template match="type"/>

The complete schema evolution stylesheet simply contains the previous templates. Without duplicating all of the code, here is its overall structure:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

      <!-- the identity transformation -->
      <xsl:template match="@*|node( )">
        ...
      </xsl:template>

      <!-- locate <message> elements -->
      <xsl:template match="message">
        ...
      </xsl:template>

      <!-- locate <when> elements -->
      <xsl:template match="when">
        ...
      </xsl:template>

      <!-- suppress the <type> element -->
      <xsl:template match="type"/>
    </xsl:stylesheet>

The Result File

Now that the stylesheet is complete, it can be applied to all of the existing XML log files using a simple shell script or batch file. The resulting XML file is shown in Example 3-12.

Example 3-12. Result of the transformation

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="schemachange.xslt"?>
    <log>
      <message type="ERROR">
        <text>input parameter was null</text>
        <timestamp time="03:12:18" day="15" month="01" year="2000"/>
        <where>
          <class>com.foobar.util.StringUtil</class>
          <method>reverse(String)</method>
        </where>
      </message>
      <message type="WARNING">
        <text>cannot read config file</text>
        <timestamp time="06:35:44" day="15" month="01" year="2000"/>
        <where>
          <class>com.foobar.servlet.MainServlet</class>
          <method>init( )</method>
        </where>
      </message>
      <message type="ERROR">
        <text>negative duration is not allowed</text>
        <timestamp time="10:01:49" day="17" month="01" year="2000"/>
        <where>
          <class>com.foobar.util.DateUtil</class>
          <method>getWeek(int)</method>
        </where>
      </message>
    </log>
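The section suggests running the finished stylesheet from a shell script or batch file. As an alternative, here is a minimal sketch of driving the same transformation from Java through the JAXP API (javax.xml.transform) that ships with the JDK. The class name SchemaEvolutionDemo, the transform helper, and the one-message sample log are illustrative assumptions rather than code from the original text; the stylesheet string simply assembles the templates shown in this section.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: applies the schema evolution stylesheet to a log document.
class SchemaEvolutionDemo {

    // The templates from this section, assembled into one stylesheet.
    // Single-quoted attributes are used only to avoid escaping in Java strings.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' encoding='UTF-8' indent='yes'/>"
        // identity transformation: copy every attribute and node
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        // <message>: <type> becomes an attribute, the text attribute becomes an element
        + "<xsl:template match='message'>"
        + "<xsl:copy>"
        + "<xsl:attribute name='type'><xsl:value-of select='type'/></xsl:attribute>"
        + "<xsl:element name='text'><xsl:value-of select='@text'/></xsl:element>"
        + "<xsl:apply-templates/>"
        + "</xsl:copy>"
        + "</xsl:template>"
        // <when>: collapse the child elements into a single <timestamp>
        + "<xsl:template match='when'>"
        + "<timestamp time='{hour}:{minute}:{second}'"
        + " year='{year}' month='{month}' day='{day}'/>"
        + "</xsl:template>"
        // suppress the original <type> element
        + "<xsl:template match='type'/>"
        + "</xsl:stylesheet>";

    // A one-message sample in the "before" format (assumed sample data).
    static final String SAMPLE_LOG =
        "<log><message text='input parameter was null'>"
        + "<type>ERROR</type>"
        + "<when><year>2000</year><month>01</month><day>15</day>"
        + "<hour>03</hour><minute>12</minute><second>18</second></when>"
        + "<where><class>com.foobar.util.StringUtil</class>"
        + "<method>reverse(String)</method></where>"
        + "</message></log>";

    // Applies STYLESHEET to the given XML text and returns the result as a string.
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_LOG));
    }
}
```

To process real log files instead of an in-memory string, the input StreamSource could be built from a java.io.File for each log file, with the result written to a StreamResult for a corresponding output file.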