Workshop: Reading RSS Syndication Feeds

There are hundreds of XML dialects that represent data in a platform-independent, software-independent manner. One of the most popular is RSS, a format for sharing headlines and links from online news sites, weblogs, and other sources of information. RSS makes web content available as XML, which is well suited to being read by software, in web-accessible files called feeds. RSS readers, also called news aggregators, have been adopted by several million information junkies to track their favorite websites. There also are web apps that collect and share RSS items.

The hard-working Builder class in the nu.xom package can load XML over the Internet from any URL:

String rssUrl = "http://search.csmonitor.com/rss/top.rss";
Builder builder = new Builder();
Document doc = builder.build(rssUrl);


This hour's workshop employs this technique to read an RSS 2.0 file, presenting the 15 most recent items. Open your editor and enter the text of Listing 21.4. Save the result as Aggregator.java.

Listing 21.4. The Full Text of Aggregator.java
import java.io.*;
import nu.xom.*;

public class Aggregator {
    public String[] title = new String[15];
    public String[] link = new String[15];
    public int count = 0;

    public Aggregator(String rssUrl) {
        try {
            // retrieve the XML document
            Builder builder = new Builder();
            Document doc = builder.build(rssUrl);
            // retrieve the document's root element
            Element root = doc.getRootElement();
            // retrieve the root's channel element
            Element channel = root.getFirstChildElement("channel");
            // retrieve the item elements in the channel
            if (channel != null) {
                Elements items = channel.getChildElements("item");
                for (int current = 0; current < items.size(); current++) {
                    // stop once the arrays are full
                    if (count >= title.length) {
                        break;
                    }
                    // retrieve the current item
                    Element item = items.get(current);
                    Element titleElement = item.getFirstChildElement("title");
                    Element linkElement = item.getFirstChildElement("link");
                    // skip an item that lacks a title or a link
                    if (titleElement == null || linkElement == null) {
                        continue;
                    }
                    // store at count so skipped items leave no gaps
                    title[count] = titleElement.getValue();
                    link[count] = linkElement.getValue();
                    count++;
                }
            }
        } catch (ParsingException exception) {
            System.out.println("XML error: " + exception.getMessage());
            exception.printStackTrace();
        } catch (IOException ioException) {
            System.out.println("IO error: " + ioException.getMessage());
            ioException.printStackTrace();
        }
    }

    public void listItems() {
        // display each stored item's title followed by its link
        for (int i = 0; i < count; i++) {
            System.out.println("\n" + title[i]);
            System.out.println(link[i]);
        }
    }

    public static void main(String[] arguments) {
        if (arguments.length > 0) {
            Aggregator aggie = new Aggregator(arguments[0]);
            aggie.listItems();
        } else {
            System.out.println("Usage: java Aggregator rssUrl");
        }
    }
}


After you compile the app successfully, it can be run with any RSS 2.0 feed. Here's a command to try it with the Top Stories feed from the Christian Science Monitor newspaper:

java Aggregator http://search.csmonitor.com/rss/top.rss


Sample output from the feed follows:

As Britain copes, a massive hunt for London bombers
http://www.csmonitor.com/2005/0711/p07s01-woeu.html

The new Al Qaeda: local franchises
http://www.csmonitor.com/2005/0711/p01s01-woeu.html

Tough job: Can anyone govern California?
http://www.csmonitor.com/2005/0711/p02s01-uspo.html
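
If you want to see the element structure that Aggregator walks through without relying on a live feed, the following sketch may help. It is not part of the workshop listing: it swaps XOM for the DOM parser built into the JDK so that it runs with no extra libraries, and it reads a small made-up feed from a string. The class name FeedSketch, the readItems() and firstChild() helpers, and the sample feed itself are all assumptions introduced for illustration. The navigation follows the same path as the workshop: the root rss element, its channel child, and then each item's title and link.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class FeedSketch {
    // hypothetical sample feed, invented for this example
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>"
        + "<rss version=\"2.0\"><channel>"
        + "<title>Example Feed</title>"
        + "<item><title>First story</title>"
        + "<link>http://example.com/1</link></item>"
        + "<item><title>Second story</title>"
        + "<link>http://example.com/2</link></item>"
        + "<item><title>Story with no link</title></item>"
        + "</channel></rss>";

    // returns "title | link" for up to max items that have both elements
    static List<String> readItems(String xml, int max) {
        List<String> result = new ArrayList<>();
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(
                xml.getBytes(StandardCharsets.UTF_8)));
            // rss is the root element; channel is its child
            Element channel = firstChild(doc.getDocumentElement(), "channel");
            if (channel == null) {
                return result;
            }
            // visit the channel's children, keeping only item elements
            for (Node node = channel.getFirstChild();
                    node != null && result.size() < max;
                    node = node.getNextSibling()) {
                if (node instanceof Element
                        && "item".equals(node.getNodeName())) {
                    Element title = firstChild((Element) node, "title");
                    Element link = firstChild((Element) node, "link");
                    // skip an item that lacks a title or a link
                    if (title != null && link != null) {
                        result.add(title.getTextContent() + " | "
                            + link.getTextContent());
                    }
                }
            }
        } catch (Exception exception) {
            // wrap parser and I/O failures to keep the sketch short
            throw new IllegalStateException("Could not read feed", exception);
        }
        return result;
    }

    // finds the first direct child element with the given name, or null
    static Element firstChild(Element parent, String name) {
        for (Node node = parent.getFirstChild(); node != null;
                node = node.getNextSibling()) {
            if (node instanceof Element && name.equals(node.getNodeName())) {
                return (Element) node;
            }
        }
        return null;
    }

    public static void main(String[] arguments) {
        for (String entry : readItems(SAMPLE, 15)) {
            System.out.println(entry);
        }
    }
}
```

The third item in the sample has no link element, so the sketch leaves it out rather than failing on a null reference. Collecting results in a list instead of fixed-size arrays is one possible design choice here; the limit of 15 is passed in as an argument to mirror the workshop's cap.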


By the way

You can find out more about the RSS 2.0 XML dialect from the RSS Advisory Board website at http://blogs.law.harvard.edu/tech. The author of this tutorial is a member of the board, which offers guidance on the format and a directory of software that can be used to read RSS feeds. There also are two other formats with similar functionality and appeal: RSS 1.0 and Atom.


      