NatureServe Web Services

Code Sample 2 - Get Data and Reformat (with XSLT crash course)

This code sample calls a web service and reformats the results into an HTML page using an XSLT transformation. XSLT is a language for transforming XML into other text-based formats, most commonly HTML.

Note: If you are new to Java, we recommend starting your reading at Code Sample 1, which includes a crash course in Java. Code Sample 2 assumes a basic understanding of Java syntax.

Specification

This program will call the Species Images Service and create an HTML page showing the available images for Golden Eagle as thumbnails. Clicking on a thumbnail will take the user to the full version of the image.

Program Code

Download: Sample2.java and Sample2.xsl

import java.io.FileInputStream;
import java.io.InputStream;
import java.net.URL;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class Sample2 {

    public static void main(String[] args) {

        try {
            String url =
                    "https://services.natureserve.org/" +
                    "idd/rest/ns/v1/globalSpecies/images";
            String uid = "ELEMENT_GLOBAL.2.100925";
            String resolution = "thumbnail";

            String request = url;
            request = request + "?";
            request = request + "uid=" + uid;
            request = request + "&";
            request = request + "resolution=" + resolution;

            URL serviceURL = new URL(request);
            InputStream is = serviceURL.openStream();
            TransformerFactory tFactory = TransformerFactory.newInstance();
            Transformer transformer =
                    tFactory.newTransformer(
                    new StreamSource(new FileInputStream("Sample2.xsl")));
            transformer.transform(new StreamSource(is),
                    new StreamResult(System.out));
        } catch (Exception e) {
            System.out.println(e);
        }

    }
}
					

Download the files and run the code now, to see it working. For help on this, see Running the samples.

In order to browse the resulting HTML, the output can be redirected to a file, rather than just displayed on screen, as follows:

java Sample2 > sample2.html

sample2.html can then be opened in a browser.
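As an alternative to shell redirection, the program could write the file itself. The following is a minimal sketch under stated assumptions, not part of the downloadable sample: the class name FileOutputSketch, the helper transformToFile and the tiny one-template stylesheet are all hypothetical, and a temporary file stands in for sample2.html.

```java
import java.io.OutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class FileOutputSketch {

    // Hypothetical stylesheet: copies the text of <note> to the output.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/note'>"
            + "<xsl:value-of select='.'/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Hypothetical helper: transforms the XML text and writes the result to a file.
    static void transformToFile(String xsl, String xml, Path target) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        // try-with-resources closes the file even if the transformation fails.
        try (OutputStream out = Files.newOutputStream(target)) {
            transformer.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
        }
    }

    public static void main(String[] args) throws Exception {
        Path target = Files.createTempFile("sample2", ".txt");
        transformToFile(STYLESHEET, "<note>written to a file</note>", target);
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

In Sample2 itself, the same idea would amount to replacing System.out with an output stream opened on sample2.html.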

Code Breakdown
...
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
					

javax.xml.transform is the package to check for classes that relate to XSLT transformations.

            String url =
                    "https://services.natureserve.org/" +
                    "idd/rest/ns/v1/globalSpecies/images";
            String uid = "ELEMENT_GLOBAL.2.100925";
            String resolution = "thumbnail";
					

The service URL is taken straight from the Species Images Service description page. "ELEMENT_GLOBAL.2.100925" is the UID for Golden Eagle, and we request only the thumbnail resolution images.
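The sample assembles the request by plain string concatenation, which works because neither parameter value contains characters that need escaping. As a hedged sketch of a more defensive approach, the hypothetical helper buildRequest below URL-encodes each parameter name and value. It assumes Java 10 or later for the URLEncoder.encode(String, Charset) overload; the class and method names are illustrative only.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class RequestUrlSketch {

    // Hypothetical helper: appends URL-encoded query parameters to a base URL.
    static String buildRequest(String baseUrl, Map<String, String> params) {
        StringBuilder request = new StringBuilder(baseUrl);
        char separator = '?';                 // first parameter follows '?'
        for (Map.Entry<String, String> p : params.entrySet()) {
            request.append(separator)
                   .append(URLEncoder.encode(p.getKey(), StandardCharsets.UTF_8))
                   .append('=')
                   .append(URLEncoder.encode(p.getValue(), StandardCharsets.UTF_8));
            separator = '&';                  // later parameters follow '&'
        }
        return request.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("uid", "ELEMENT_GLOBAL.2.100925");
        params.put("resolution", "thumbnail");
        System.out.println(buildRequest(
                "https://services.natureserve.org/idd/rest/ns/v1/globalSpecies/images",
                params));
    }
}
```

For the values used in this sample, the result is the same request string that Sample2 builds by hand.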

            TransformerFactory tFactory = TransformerFactory.newInstance();
            Transformer transformer =
                    tFactory.newTransformer(
                    new StreamSource(new FileInputStream("Sample2.xsl")));
            transformer.transform(new StreamSource(is),
                    new StreamResult(System.out));
					

XSLT works by 'transforming' an input into an output, based on a stylesheet. A TransformerFactory lets us obtain new Transformer objects. Each Transformer object performs the transformation specified by one stylesheet. Note that transformers do not work directly with streams. Instead, Sources and Results are used, though these can safely be thought of simply as wrappers for streams.

The code above creates a Transformer object named transformer, which will perform transformations using the stylesheet found in the file Sample2.xsl (explained below). The call to transformer.transform() transforms the data from the input stream is (the service results) and writes the results to the System output stream (often the screen).
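To see that Sources and Results are just wrappers, the same calls can be run entirely in memory. This is a minimal sketch, assuming a hypothetical one-template stylesheet and a hypothetical greeting document; it wraps a StringReader and a StringWriter where the sample wraps a file, a network stream and System.out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class InMemoryTransformSketch {

    // Hypothetical stylesheet: copies the text of <greeting> to the output.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/greeting'>"
            + "<xsl:value-of select='.'/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Applies the stylesheet text to the XML text and returns the result.
    static String transform(String xsl, String xml) throws Exception {
        TransformerFactory tFactory = TransformerFactory.newInstance();
        // The Source wraps a Reader here instead of a FileInputStream.
        Transformer transformer =
                tFactory.newTransformer(new StreamSource(new StringReader(xsl)));
        // The Result wraps a Writer here instead of System.out.
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(STYLESHEET, "<greeting>hello</greeting>"));
    }
}
```

Working with strings like this is also a convenient way to experiment with small stylesheet fragments without calling the web service.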

XSLT Stylesheet Code
<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:ns=
       "http://services.natureserve.org/docs/schemas/biodiversityDataFlow/1"
  xmlns:dc="http://purl.org/dc/terms/"
  version="1.0">
        
    <xsl:output method="html"/>
	
    <xsl:template match="/ns:images">
        <html>
            <head>
                <title>Golden Eagle Images report</title>
            </head>
            <body>
                <h1>Golden Eagle Images Report</h1>
                <h2>You should pay proper heed to all required copyright
                       and permissions details when processing images.
                       This report is for sample code purposes only and
                       does a minimal job of addressing these concerns.
                </h2>
                <xsl:apply-templates select="ns:image" />
            </body>
        </html>
    </xsl:template>

    <xsl:template match="/ns:images/ns:image">
        <p>
            <a href="{normalize-space(dc:isVersionOf)}">
              <img src="{normalize-space(dc:identifier)}" title="{dc:title}"/>
            </a><br/>
            <xsl:value-of select="dc:title"/><br />
            <xsl:if test="dc:rightsHolder">
                <xsl:text>&#xA9; </xsl:text>
                <xsl:for-each select="dc:rightsHolder[position() != last()]">
                    <xsl:value-of select="."/>
                    <xsl:if test="position() != last()">
                        <xsl:text>, </xsl:text>
                    </xsl:if>
                </xsl:for-each>
                <br />
            </xsl:if>
            <xsl:value-of select="dc:rights"/>
        </p>
    </xsl:template>

</xsl:stylesheet>
					
Code Breakdown

It might be helpful to open the stylesheet in another window when reviewing the code.

<?xml version="1.0" encoding="UTF-8"?>
					

XSLT is written in XML. This line is the standard declaration that this is an XML file.

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:ns=
        "http://services.natureserve.org/docs/schemas/biodiversityDataFlow/1"
  xmlns:dc="http://purl.org/dc/terms/"
  version="1.0">
...
</xsl:stylesheet>
					

An XSLT stylesheet consists of one xsl:stylesheet element. Within this element, the stylesheet transformations are defined. The stylesheet references various 'namespaces' and assigns each an abbreviation:

  • xmlns:xsl="..." - This is the namespace for XSLT itself. This command tells the transformer that any XML element (or attribute) that follows with a name prefixed with xsl is to be treated as a stylesheet instruction.
  • xmlns:ns="..." - This is the namespace for NatureServe's data. Anything that follows with a name prefixed with ns is defined by NatureServe.
  • xmlns:dc="..." - This is the namespace for the Dublin Core Metadata Initiative (DCMI). The DCMI has defined a set of internationally recognised standard elements for encoding metadata. The Species Images Web Service makes use of these in its output, using the prefix dc.

A note on namespaces: The prefix assigned to a namespace is arbitrary, but governed by convention. Conceptually, the XML parser resolves every prefixed name to the full namespace value plus the local name, so xsl:stylesheet is treated as the stylesheet element of http://www.w3.org/1999/XSL/Transform. We could have assigned xyz to the XSLT namespace and everything would have worked exactly the same. If we had changed the namespace value itself, on the other hand, the transformer would not have recognised any of our instructions as instructions.
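The claim that the prefix is arbitrary can be checked directly. The sketch below is an illustration only: the helper stylesheet(prefix) is hypothetical and generates the same tiny stylesheet twice, once with the conventional xsl prefix and once with xyz, both bound to the real XSLT namespace.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class NamespacePrefixSketch {

    // Hypothetical helper: builds one stylesheet using the given prefix
    // for the (unchanged) XSLT namespace.
    static String stylesheet(String prefix) {
        return "<" + prefix + ":stylesheet version='1.0'"
                + " xmlns:" + prefix + "='http://www.w3.org/1999/XSL/Transform'>"
                + "<" + prefix + ":output method='text'/>"
                + "<" + prefix + ":template match='/greeting'>"
                + "<" + prefix + ":value-of select='.'/>"
                + "</" + prefix + ":template>"
                + "</" + prefix + ":stylesheet>";
    }

    static String transform(String xsl, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<greeting>hello</greeting>";
        // Both runs print the same text, because only the namespace value matters.
        System.out.println(transform(stylesheet("xsl"), xml));
        System.out.println(transform(stylesheet("xyz"), xml));
    }
}
```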

    <xsl:output method="html"/>
					

We tell the transformer that we are going to generate HTML as the output format. We could also have specified xml or text.
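The effect of the output method is easiest to see by running one template under all three settings. The following sketch is illustrative: the helper render(method) and the sample markup are hypothetical, and the exact serialisation (for example, whether an XML declaration or line breaks appear) can vary between XSLT processors.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class OutputMethodSketch {

    // Hypothetical helper: emits the same markup under the given output method.
    static String render(String method) throws Exception {
        String xsl =
                "<xsl:stylesheet version='1.0'"
                + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
                + "<xsl:output method='" + method + "'/>"
                + "<xsl:template match='/'><p>a<br/>b</p></xsl:template>"
                + "</xsl:stylesheet>";
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        // The input document is irrelevant here; the template emits fixed markup.
        transformer.transform(new StreamSource(new StringReader("<doc/>")),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("html: " + render("html")); // br is written HTML-style
        System.out.println("xml:  " + render("xml"));  // br stays a self-closed element
        System.out.println("text: " + render("text")); // tags dropped, text kept
    }
}
```

With html, empty elements such as br are written the way browsers expect; with xml, the output stays well-formed XML; with text, only the text content survives.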

    <xsl:template match="/ns:images">
        <html>
            <head>
                <title>Golden Eagle Images report</title>
            </head>
            <body>
                <h1>Golden Eagle Images Report</h1>
                <h2>You should pay proper heed to all required copyright
                       and permissions details when processing images.
                       This report is for sample code purposes only and
                       does a minimal job of addressing these concerns.
                </h2>
                <xsl:apply-templates select="ns:image" />
            </body>
        </html>
    </xsl:template>
					

A stylesheet consists of a set of 'templates' that act on the XML data. The transformer starts processing the document from the root element downwards. At each element, it checks to see whether there are any templates defined for that element. If so, that template's instructions are performed.

In this case, we see that this template matches on "/ns:images". This is the root element of the XML, so this template will get run once. Elements are referenced by an XPath expression, which (very roughly) is a little like a filename with its folders specified. /ns:images refers to 'the element called ns:images at the root of the XML'.

A standard HTML page is embedded in the template. Most of the contents of the template tag will be output verbatim, since they are not part of the XSLT namespace.

The xsl:apply-templates select="ns:image" element, being in the XSLT namespace, is treated as an instruction. In this case, the processor is being told to stop running down the XML sequentially and instead to find (select) any ns:image elements 'in context' (in this case, within the "/ns:images" element - think of working in a particular folder, where you do not have to specify the folder name when working on a file). For each ns:image element found, the processor applies the template that matches it.
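The interplay between a root template and apply-templates can be tried on a much smaller document. This is a sketch under assumptions: the list and item element names and the bracket and semicolon markers are hypothetical, chosen only to make visible which template produced which part of the output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ApplyTemplatesSketch {

    // Hypothetical two-template stylesheet mirroring the structure of Sample2.xsl.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            // Runs once, for the root element; the brackets are output verbatim.
            + "<xsl:template match='/list'>"
            + "[<xsl:apply-templates select='item'/>]"
            + "</xsl:template>"
            // Runs once per item selected by the apply-templates above.
            + "<xsl:template match='/list/item'>"
            + "<xsl:value-of select='.'/>;"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    static String transform(String xsl, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<list><item>a</item><item>b</item><item>c</item></list>";
        System.out.println(transform(STYLESHEET, xml)); // prints [a;b;c;]
    }
}
```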

    <xsl:template match="/ns:images/ns:image">
        <p>
            <a href="{normalize-space(dc:isVersionOf)}">
              <img src="{normalize-space(dc:identifier)}" title="{dc:title}"/>
            </a><br/>
            <xsl:value-of select="dc:title"/><br />
            <xsl:if test="dc:rightsHolder">
                <xsl:text>&#xA9; </xsl:text>
                <xsl:for-each select="dc:rightsHolder[position() != last()]">
                    <xsl:value-of select="."/>
                    <xsl:if test="position() != last()">
                        <xsl:text>, </xsl:text>
                    </xsl:if>
                </xsl:for-each>
                <br />
            </xsl:if>
            <xsl:value-of select="dc:rights"/>
        </p>
    </xsl:template>
					

This template matches on any /ns:images/ns:image element, so the previous template's apply-templates will cause this to be run once for each image.

Each image will be displayed within a <p> paragraph, which contains the thumbnail image acting as a link to the larger image, followed by the image title, copyright information and permissions.

The link is constructed as a standard <a> tag. The curly braces ({}) mark an XSLT expression. In the case of href, the expression normalize-space(dc:isVersionOf) means 'the contents of the dc:isVersionOf element, with all leading and trailing whitespace removed and any internal runs of whitespace collapsed to a single space'. dc:isVersionOf holds the URL for the 'other' version of the image. Since the service is giving us the thumbnail, the other version is the full size image.

The derivation of img src is similar, where dc:identifier contains the URL of the image itself, i.e. the thumbnail. The derivation of img title should now be clear.
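The behaviour of normalize-space can be checked outside a stylesheet with the standard javax.xml.xpath API. The sketch below is illustrative: the title element and its contents are hypothetical, and no namespaces are used so that the expressions stay short.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class NormalizeSpaceSketch {

    // Evaluates an XPath expression against XML text and returns the string result.
    static String evaluate(String expression, String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical element whose text is padded with newlines and extra spaces.
        String xml = "<title>\n   Golden   Eagle\n</title>";
        // Raw string value: whitespace is preserved exactly as in the document.
        System.out.println("[" + evaluate("string(/title)", xml) + "]");
        // normalize-space trims both ends and collapses internal runs of whitespace.
        System.out.println("[" + evaluate("normalize-space(/title)", xml) + "]");
    }
}
```

The second line printed is [Golden Eagle], which is why the stylesheet wraps URL-valued elements in normalize-space before placing them in href and src attributes.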

The title of the image is placed on the page using the xsl:value-of select="dc:title". value-of, as the name suggests, outputs the value of the expression given, in this case, the contents of the dc:title element.

Since there may be no rights holder for an image (if it is in the Public Domain, for example), we check whether this information is available using the xsl:if test="dc:rightsHolder" element. A test that selects elements is true when at least one matching element exists, so the contents of the if are processed only when the image has a dc:rightsHolder element.

The xsl:text element inserts literal text into the output exactly as written, including any whitespace; here it guarantees that the space after the copyright sign is kept. Notice that all the HTML in the stylesheet is well-formed XML, with every tag properly closed. Because a stylesheet is itself an XML document, the parser will reject it if any markup inside a template is not well-formed.

The value of the text, &#xA9;, appears a little odd. It is a numeric character reference for the character ©. We would like to place the HTML &copy; shortcut (entity) for this character into our output, but XML, and therefore XSLT, does not define this entity. XML does reserve the form &whatever; for entity references, however. This means that if we try to write '&copy;' in our stylesheet, the XML parser will report an undefined entity. Instead, we insert the © character itself. Since we have specified HTML as the output format, the transformer is allowed to write this character out as the proper &copy; entity; either way, the browser displays ©.

When working with HTML entities it is often best to use the numeric code, which can be found on the Unicode code charts, and avoid much confusion. Most of the commonly used characters fall in the 'Basic Latin' or 'Latin-1' charts. While it is initially tempting to try to output combinations such as &amp;copy;, this will end in failure: the transformer escapes the ampersand again on output, so the browser displays the literal text &copy; rather than ©.
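The numeric reference can be tried in isolation. The sketch below is a hedged illustration with a hypothetical one-template stylesheet; how the character is serialised (as the literal ©, as &copy; or as a numeric reference) is an implementation choice of the XSLT processor, so no single output should be relied upon.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CopyrightCharacterSketch {

    // Hypothetical stylesheet: emits a paragraph starting with the copyright sign,
    // written as the numeric character reference &#xA9; just as in Sample2.xsl.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='html'/>"
            + "<xsl:template match='/'>"
            + "<p><xsl:text>&#xA9; </xsl:text>Example Holder</p>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    static String transform(String xsl, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The processor decides how to serialise the character; a browser
        // shows the copyright sign in every case.
        System.out.println(transform(STYLESHEET, "<doc/>"));
    }
}
```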

The xsl:for-each instruction creates a loop over a set of elements - the set specified in its select expression. The expression here demonstrates a feature of XPath - 'predicates'. The expression dc:rightsHolder[position() != last()] says 'all of the dc:rightsHolder elements, up to, but not including, the last one in document order'. Thus we are creating a loop which will process all but the last rights holder. Why ignore the last? Because the service description tells us that this always contains a description of how the rights are held ('Copyright Held By Creator', etc), rather than a rights holder name.

The xsl:value-of select="." outputs the contents of the current (.) element in context. The context of a for-each is the element it is currently processing, so the 'value-of' the current dc:rightsHolder is output.

Finally, the if decides whether a comma should be added after this rights holder name, by deciding whether this one is the last in the list. NOTE that the last() here is not the same as in the for-each predicate - this last() is calculated as being the last one of the set that the for-each predicate creates.
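The two uses of last() can be seen at work on a small document. This is a sketch under assumptions: the element names image and rightsHolder are used without namespaces for brevity, and the holder names are invented. The final rightsHolder plays the role of the 'how the rights are held' description that the loop skips.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class RightsHolderLoopSketch {

    // Hypothetical stylesheet reproducing the loop from Sample2.xsl without namespaces.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/image'>"
            + "<xsl:if test='rightsHolder'>"
            // Predicate last(): the last rightsHolder in the document, which is skipped.
            + "<xsl:for-each select='rightsHolder[position() != last()]'>"
            + "<xsl:value-of select='.'/>"
            // Loop last(): the last element of the already filtered set.
            + "<xsl:if test='position() != last()'><xsl:text>, </xsl:text></xsl:if>"
            + "</xsl:for-each>"
            + "</xsl:if>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    static String transform(String xsl, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<image>"
                + "<rightsHolder>Alice</rightsHolder>"
                + "<rightsHolder>Bob</rightsHolder>"
                + "<rightsHolder>Copyright Held By Creator</rightsHolder>"
                + "</image>";
        System.out.println(transform(STYLESHEET, xml)); // prints Alice, Bob
    }
}
```

Of the three elements, the predicate keeps the first two; within that filtered set, Bob is at the last position, so no comma follows him.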

See also

The Technical Library has a list of XSLT learning materials. One absolutely essential resource when stumped by anything XSLT is Dave Pawson's XSL Frequently Asked Questions page. There is usually no need to look any further.