9.1 XSL and XSLT

You have discovered the weakness of using CSS to style an XML document: CSS does not allow you to transform your document. Elements must remain in the order they appear, and attribute values are not displayed. There is no real decision-making power in CSS, so you cannot decide how something should appear based on information about that element or attribute (beyond what selectors can do). To address this issue, the W3C came up with yet another pair of recommendations: Extensible Stylesheet Language (XSL) and XSL Transformations (XSLT). XSL is the general family of XML styling vocabularies from the W3C, of which there are currently three: XSLT, XSL Formatting Objects (XSL-FO), and XML Path Language (XPath).

XSL Formatting Objects is an alternative to CSS that is used mainly in the printing business, and offers the same basic functionality. You can use XSL-FO if you like, but because CSS is much simpler, more clearly documented, and better supported by browsers, you are usually better off using CSS.

XPath is a method to select a given element or attribute in the XML hierarchy. The syntax can be very simple or very complicated. The basic syntax will be covered in this chapter, but you can form very complicated expressions to select a specific node (an element, an attribute, or a piece of text) in the document.

All of these standards are documented in their W3C recommendations, as well as on many websites and in books.

XSLT is a simple, reasonably easy to understand XML vocabulary for the transformation of documents from one format to another. You can use XSLT to convert from one XML vocabulary to a different XML vocabulary, or to HTML, text files, or virtually any text file format.

9.2 Structure

To begin, you have an XML document that needs to be transformed. We won't be using the coupon vocabulary; that one will be yours to transform. For the examples here, a more database-like vocabulary will be used, to demonstrate how well XSLT handles such a vocabulary.

<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<?xml-stylesheet href="people.xsl" type="text/xsl"?>
<people>
 <list-name>Favorite Colors</list-name>
 <person>
  <name>
   <first>Bob</first>
   <last>Toddson</last>
  </name>
  <acct-no>327598</acct-no>
  <fav-color hex="#ff0000">Red</fav-color>
 </person>
 <person>
  <name>
   <first>Red</first>
   <last>McBlue</last>
  </name>
  <acct-no>209890</acct-no>
  <fav-color hex="#00ff00">Green</fav-color>
 </person>
 <person>
  <name>
   <first>Tammy</first>
   <last>Yu</last>
  </name>
  <acct-no>978541</acct-no>
  <fav-color hex="#7fff00">Chartreuse</fav-color>
 </person>
 <person>
  <name>
   <first>Phillip</first>
   <last>Cardwell</last>
  </name>
  <acct-no>258929</acct-no>
  <fav-color hex="#d2b48c">Tan</fav-color>
 </person>
</people>

Note the xml-stylesheet processing instruction; this time, the text/xsl MIME type is used. This MIME type is technically incorrect, because it was never officially registered (the type that was eventually registered for XSLT is application/xslt+xml). However, text/xsl is the type that browsers recognize here, so it is the one to use.

How would one style this? It would certainly look unpleasant if you tried to style this with CSS. Instead, use XSL. To begin, start with the root element, which for XSL is stylesheet.

Much like XML Schema, you are expected to use the proper namespace for your XSLT document. For these examples I will use prefixes, because not only are they necessary, they can also help you find the XSL tags while writing your own stylesheet.

<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

</xsl:stylesheet>

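Also, every XSLT document should declare an output method with the output element; XSLT 1.0 can output HTML, XML, and text. (If you leave the output element out, the processor chooses a default, which is normally XML.) For the first example, we will output HTML. (Use XML when outputting XHTML.)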

<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" />

</xsl:stylesheet>
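
If you would rather produce XHTML, the output element can also write the document type declaration for you. The following is only a minimal sketch, assuming you want XHTML 1.0 Strict; doctype-public, doctype-system, and indent are standard XSLT 1.0 attributes of the output element, but the choice of the Strict DTD and the default namespace on the stylesheet element are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- The default namespace (an assumption for this sketch) puts every
     literal output element, such as html or body, in the XHTML namespace. -->
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns="http://www.w3.org/1999/xhtml">

 <!-- method="xml" because XHTML is an XML vocabulary;
      the two doctype attributes produce the DOCTYPE line in the output. -->
 <xsl:output method="xml"
  doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
  doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
  indent="yes" />

</xsl:stylesheet>
```

The indent attribute only asks the processor to add line breaks and indentation to the output; processors are allowed to ignore it.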

Next, you define templates, which are rules that are applied to elements. These are similar to rules in CSS, but remember that XSLT is used for transformation, not styling. You will often want to define a rule for the root (of the document, not the root element), which gives you the ability to add text to the beginning and end of the output. The pattern (XSLT's counterpart to a CSS selector, written in XPath syntax) for the root is /.

...
<xsl:output method="html" />

<xsl:template match="/">
 <html>
  <head>
   <title>People Report: <xsl:value-of select="/people/list-name" />
   </title>
  </head>
  <body>
   <center>
    <h1>People Report: <xsl:value-of select="/people/list-name" />
    </h1>
   </center>
   <table border="1" align="center">
    <tr>
     <td>Last Name</td>
     <td>First Name</td>
     <td>Account Number</td>
     <td>Favorite Color</td>
    </tr>
    <xsl:apply-templates select="/people/person" />
   </table>
  </body>
 </html>
</xsl:template>

...

Let's review the new XSLT elements before moving on to the template for the person element. The contents of a template are output in place of whatever the match pattern selects, so in this case, the contents are output at the root level, above all elements (even the root element).

The value-of element simply inserts the text content of a selected element. This is a more complicated expression; this one is read from right to left as, "A list-name that is a child of a people that is a child of the root." In XPath, forward slashes are used to navigate through the XML tree structure, much like the file system on a hard disk.
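
To see a few of these paths at work, here is a small sketch of a stylesheet that prints three values from the people document as plain text. It is not part of the report we are building; it uses the text output method and the text element (which simply outputs the characters inside it, here a line break written as &#10;) only to keep the output easy to read.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text" />

 <xsl:template match="/">
  <!-- Absolute path from the root: outputs "Favorite Colors" -->
  <xsl:value-of select="/people/list-name" />
  <xsl:text>&#10;</xsl:text>

  <!-- This path matches four first elements; in XSLT 1.0, value-of
       uses only the first one in document order: outputs "Bob" -->
  <xsl:value-of select="/people/person/name/first" />
  <xsl:text>&#10;</xsl:text>

  <!-- An @ selects an attribute instead of a child element;
       again only the first match is used: outputs "#ff0000" -->
  <xsl:value-of select="/people/person/fav-color/@hex" />
  <xsl:text>&#10;</xsl:text>
 </xsl:template>

</xsl:stylesheet>
```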

The apply-templates element is sort of like a subroutine call for the XSLT processor; it tells the processor to find the template that matches each selected element and insert that template's output here. Any elements that you do not include in an apply-templates rule will simply not appear. The behavior for XSLT is different from CSS, where you had to hide elements you did not want to appear. In this case, there is a table defined in the main body of the document, and the template for the person element inserts one table row for each person.

The template for the person element is then placed after the root template:

...
<xsl:template match="person">
 <tr>
  <td>
   <xsl:value-of select="name/last" />
  </td>
  <td>
   <xsl:value-of select="name/first" />
  </td>
  <td>
   <xsl:value-of select="acct-no" />
  </td>
  <td>
   <xsl:value-of select="fav-color" />
  </td>
 </tr>
</xsl:template>

</xsl:stylesheet>

Note that these select expressions do not start at the root. Instead, they are relative XPath expressions that are based on the context of their location. The context is the starting point for XPath expressions that do not reference the root. In this case, the context is the person element that is currently being processed. If you view the current XML file in an XSLT-capable browser, you will now see this:

People Report: Favorite Colors


Last Name   First Name   Account Number   Favorite Color
Toddson     Bob          327598           Red
McBlue      Red          209890           Green
Yu          Tammy        978541           Chartreuse
Cardwell    Phillip      258929           Tan

XSL is already proving to be much more useful than CSS. We have transformed an XML file to an HTML document that is clear and easy to read. However, let's say that we want to make the background color of the table cells match the person's favorite color, using the hex attribute. Can XPath select the value of attributes? The answer is yes! But, you cannot embed one tag within another, so we need an alternative way to copy the value into a style attribute. This is where the variable element comes in.

...
  <td>
   <xsl:value-of select="acct-no" />
  </td>
  <xsl:variable name="hexcolor">
   <xsl:value-of select="fav-color/@hex" />
  </xsl:variable>
  <td style="background-color: {$hexcolor};">
   <xsl:value-of select="fav-color" />
  </td>
...

The variable element copies its contents, which in this case are the output from a value-of element, into the variable hexcolor. The value is then pasted in wherever the processor sees the variable name preceded by a $. When you use it inside an attribute of an output element, as with the style attribute, it must also be enclosed in curly braces {}; the braces are not recognized in ordinary element content, where you would use value-of instead. Now the colors are easier to recognize:
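
The curly braces can hold any XPath expression, not just a variable name, so the variable is a convenience rather than a requirement. As a sketch of that alternative, the person template below puts the path to the hex attribute directly inside the braces; apart from that one change it is the same template as before.

```xml
<!-- Alternative person template; it belongs inside xsl:stylesheet
     in place of the earlier person template. -->
<xsl:template match="person">
 <tr>
  <td><xsl:value-of select="name/last" /></td>
  <td><xsl:value-of select="name/first" /></td>
  <td><xsl:value-of select="acct-no" /></td>
  <!-- The expression in braces is evaluated relative to the
       current person, exactly like a select attribute. -->
  <td style="background-color: {fav-color/@hex};">
   <xsl:value-of select="fav-color" />
  </td>
 </tr>
</xsl:template>
```

A variable is still the better choice when the same value is needed in several places, because the path is then written only once.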

People Report: Favorite Colors


Last Name   First Name   Account Number   Favorite Color
Toddson     Bob          327598           Red
McBlue      Red          209890           Green
Yu          Tammy        978541           Chartreuse
Cardwell    Phillip      258929           Tan

There is only one problem remaining. The names are not sorted in the XML file, and so they appear in the same unsorted order in the output. You can make the names appear more organized by sorting them on the last name. To do this, use the handy sort element:

...
    <xsl:apply-templates select="/people/person">
     <xsl:sort select="name/last" />
     <xsl:sort select="name/first" />
    </xsl:apply-templates>
...

Note that sort is a child of apply-templates, which was previously a self-closing tag. This alone is enough to sort by last name, and by first name if there are people with the same last name (you can modify the XML to test this). The select attribute is used to define what content to use as the key for sorting. You can be more specific, and have more complicated sorting commands. For example, when sorting numbers, specify data-type="number" to prevent 10 from coming after 1 and before 2. To sort in descending key order, specify order="descending". The defaults for those two are text and ascending.

Using XSLT to convert your XML document to HTML makes it easier to manage data that is presented on the web. However, not all browsers are XSL capable yet (although it's close). There are numerous scripts that will process XSL for you on the server side, so the visitor's browser is not required to do so.
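
As a sketch of those two attributes together, the fragment below would replace the apply-templates element in the root template to list people by account number, highest first. Sorting on acct-no is only an illustration; the report in this chapter keeps sorting by name.

```xml
<!-- Replaces the apply-templates element inside the root template. -->
<xsl:apply-templates select="/people/person">
 <!-- data-type="number" compares the keys as numbers, not as text;
      order="descending" puts the largest account number first. -->
 <xsl:sort select="acct-no" data-type="number" order="descending" />
</xsl:apply-templates>
<!-- Resulting row order for the sample data:
     Yu (978541), Toddson (327598), Cardwell (258929), McBlue (209890) -->
```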

9.3 Other XSL Applications

XSLT can be used for other file formats besides HTML. XML to XML conversion using XSLT is one of the most powerful uses of XSLT; this makes any XML vocabulary transformable into any other XML vocabulary. For example, say you have a vocabulary for a phone book, and a document looks like this:

<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<phonebook>
 <name phone="3258908">Simmons, Mary</name>
 <name phone="2098359">Stimson, Greg</name>
</phonebook>

It is easy to convert our people vocabulary to this phonebook vocabulary using a short stylesheet. We'll say that numbers are optional and leave them off for now. Note the output method: it is now xml.

<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" />

<xsl:template match="/">
 <phonebook>
  <xsl:apply-templates select="/people/person">
   <xsl:sort select="name/last" />
   <xsl:sort select="name/first" />
  </xsl:apply-templates>
 </phonebook>
</xsl:template>

<xsl:template match="person">
 <name>
  <xsl:value-of select="name/last" />, <xsl:value-of select="name/first" />
 </name>
</xsl:template>

</xsl:stylesheet>

The only problem with this is that by transforming from one XML document to another, you are overriding the default XSL stylesheet that your browser uses to pretty-print the source code of your XML document. As a result, your web browser will only display this:

Cardwell, PhillipMcBlue, RedToddson, BobYu, Tammy

The solution is to run the transformation in a separate XSL processor rather than in the browser's built-in one. The one I used in testing was a very nice one being developed in JavaScript using AJAX by Google. The output isn't pretty when it comes out, but once you organize it, it looks like this:

<phonebook>
 <name>Cardwell, Phillip</name>
 <name>McBlue, Red</name>
 <name>Toddson, Bob</name>
 <name>Yu, Tammy</name>
</phonebook>

This can then be copied into a file (you will need to add your own XML declaration to the top) and loaded using a parser for the phonebook vocabulary. This is a simple example, but the possibilities are literally endless.

Likewise, you can convert the data in the people document into a comma-separated values (CSV) file for use in a spreadsheet. However, the tricky thing is that spaces or newlines you use to indent your code can be interpreted as character data that is sent to output. (Whitespace that sits next to literal text in a template is always output; a conforming processor strips whitespace that stands alone between two tags, which is why this happens in most places but not all.) To prevent formatting errors, you have to sacrifice a bit of readability in your XSL code. I will first show you the pretty version (note the text output method):

<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" />

 <xsl:template match="/">
  Last Name,First Name,Account Number,Favorite Color<xsl:text>
</xsl:text><xsl:apply-templates select="/people/person">
   <xsl:sort select="name/last" />
   <xsl:sort select="name/first" />
  </xsl:apply-templates>
 </xsl:template>

 <xsl:template match="person">
  <xsl:value-of select="name/last" />,<xsl:value-of select="name/first" />,<xsl:value-of
select="acct-no" />,<xsl:value-of select="fav-color" /><xsl:text>
 </xsl:text>
</xsl:template>

</xsl:stylesheet>

With the indentation and extra lines, the output will be mangled and unreadable by spreadsheet programs:

  Last Name,First Name,Account Number,Favorite Color
  Cardwell,Phillip,258929,Tan
 McBlue,Red,209890,Green
 Toddson,Bob,327598,Red
 Yu,Tammy,978541,Chartreuse

Note the extra lines on top and bottom, and the stray spaces at the start of each line. To correct the problem, remove all of the spaces and newline characters within templates except the ones contained in xsl:text elements. The text element is an instruction to the XSL processor to preserve all character data contained within it, including whitespace; so in this case a new line will be preserved between every record in the spreadsheet. If you did not use text, you would find that all of your persons were combined into one long record. Once you modify your document to remove all indentation and newlines with that one exception, you will have a document that looks like this in the first screenful:

<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transf...
<xsl:output method="text" />

<xsl:template match="/">Last Name,First Name,Account Number,Favorite Color...
</xsl:text><xsl:apply-templates select="/people/person"><xsl:sort select="...

<xsl:template match="person"><xsl:value-of select="name/last" />,<xsl:valu...
</xsl:text></xsl:template>

</xsl:stylesheet>

The output from this stylesheet is now correct:

Last Name,First Name,Account Number,Favorite Color
Cardwell,Phillip,258929,Tan
McBlue,Red,209890,Green
Toddson,Bob,327598,Red
Yu,Tammy,978541,Chartreuse
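
If the long lines bother you, there is a way to keep the indentation, sketched below under the assumption that your processor follows the XSLT 1.0 whitespace rules: put every piece of literal text, including the commas, inside its own text element. Whitespace that stands alone between two tags is then stripped by the processor, so the indentation never reaches the output. The &#10; reference is simply another way of writing the newline character.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text" />

 <xsl:template match="/">
  <!-- Header row; all literal text lives inside xsl:text -->
  <xsl:text>Last Name,First Name,Account Number,Favorite Color&#10;</xsl:text>
  <xsl:apply-templates select="/people/person">
   <xsl:sort select="name/last" />
   <xsl:sort select="name/first" />
  </xsl:apply-templates>
 </xsl:template>

 <xsl:template match="person">
  <!-- One CSV record per person: four fields, three commas, one newline -->
  <xsl:value-of select="name/last" />
  <xsl:text>,</xsl:text>
  <xsl:value-of select="name/first" />
  <xsl:text>,</xsl:text>
  <xsl:value-of select="acct-no" />
  <xsl:text>,</xsl:text>
  <xsl:value-of select="fav-color" />
  <xsl:text>&#10;</xsl:text>
 </xsl:template>

</xsl:stylesheet>
```

The trade-off is a longer stylesheet in exchange for one that can be indented freely. If your processor does not strip whitespace-only text, fall back to the compact version above.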

This data can be loaded into your favorite spreadsheet program now. Of course, it isn't nearly as easy to convert this data back into XML. This is another benefit of XML: While it may not be easy to convert other formats into different file types, XML can be easily converted into virtually any text-based format. It is even possible, though complicated, to create Acrobat PDF files from XML using XSL. Most of you will see the most benefit from HTML or XHTML.

9.4 Chapter Review & Exercises

You should now know how to create XSLT stylesheets to convert an XML document into HTML, XML, or text formats. You should know how to sort output and modify attributes in HTML tags using XML data. You should also know how to use basic XPath to select elements, children, and attributes.

  1. Transform the coupon XML vocabulary into HTML or XHTML using XSLT. Use CSS on the output as desired.

  2. Create an XSLT stylesheet for your computer lab XML vocabulary. Convert all the information into HTML or XHTML, and convert whatever information makes sense in a spreadsheet into a CSV file.