Taking a Tour of ROME Blog


    The Main Streets of ROME
    Be Cautious of Uncertain Turns
    Making Your Own Routes
    Let Us Not Circle Over the Same Paths
    No Need to Revisit Known Routes
    More Roads to be Explored

    On the java.net project page for ROME, the famous line from Ambrose Bierce is quoted:

    All roads, howsoe'er they diverge, lead to Rome

    In this case, it's all feeds that may be reached by ROME. The ROME in question is a Java library that provides a single interface to web syndication feeds while abstracting the differences between RSS and Atom. ROME version 0.8 contains many bug fixes and support for Atom 1.0. With it you can read, create, merge, filter, and otherwise mash up your favorite syndicated streams.

    ROME uses the JDom library for parsing the XML and building the objects that the developer uses. JDom, in turn, can use the XML parser it has built into it or use others that conform to the JAXP specification. For the sample code in this article, JDom was set up to use the Xerces parser from the Apache project (distributed with JDom).

    The Main Streets of ROME

    To begin with, let's look at a fairly simple case of using the classes and methods from ROME. The code in FeedReader.java (all code is downloadable from the link in Resources, below) shows a simple Java class that reads a feed from a URL on the command line. It parses the feed and presents some simple information about it: title, author, description, publication date, copyright, etc. It lists the URIs of any syndication modules (such as Dublin Core) that the feed uses, and then the titles of the entries (or articles, if you prefer). Lastly, it shows the URL of the image that the feed references, if one is given.

    We start with a quick look at the imports section, just the imports provided by ROME itself. These five come from three of the six namespaces that ROME uses:

    Provides the parent class for the RSS and Atom beans.
    Provides the implementation classes for the core elements of Atom feeds.
    Provides the beans for handling syndication modules. The example code uses the Module interface from this namespace.
    As with the Atom-related namespace above, this provides the implementation classes for RSS feeds.
    This namespace holds most of the bean classes that an application will actually use. The interfaces and concrete classes that provide access to feeds while abstracting their details are here. The sample code uses SyndFeed and SyndEntryImpl from this namespace.
    Lastly, this namespace provides input and output classes for reading and parsing the feeds themselves, prior to their instantiation as classes from the previous namespaces. The sample code uses the XmlReader class from here, which handles the character-set issues in the XML reading/parsing process. It also uses the SyndFeedInput class, which will use the XmlReader to actually pull in the contents of the feed.

    The heart of the program starts around line 21:

    final URL feedUrl = new URL(args[0]);
    final SyndFeedInput input = new SyndFeedInput();
    final SyndFeed feed = input.build(new XmlReader(feedUrl));

    An object of the SyndFeedInput class is built to read and parse a feed, and it in turn uses an on-the-fly instance of the XmlReader class to provide the input stream. The XmlReader class is very handy here, since it tries to handle all character-encoding issues for you. The SyndFeedInput object handles creating feed objects from input sources (like the XmlReader). SyndFeed is an interface to all of the types of feeds ROME provides support for. With a SyndFeed handle, you can treat all feeds identically.

    The next lines use accessor methods of the SyndFeed handle to pick out interesting parts of the feed for display to the screen:

    System.out.println("Title: " + feed.getTitle());
    System.out.println("Author: " + feed.getAuthor());
    // and so on...

    Most of the classes that an application will use follow the Java Bean pattern, with member data accessible via getter and setter methods. Members like the syndication modules and the feed entries themselves return objects that can be managed via the List interface. This is done in the for loops, for the syndication modules and entries.

    for (final Iterator iter = feed.getModules().iterator(); iter.hasNext();) {
        System.out.println("\t" + ((Module)iter.next()).getUri());
    }

    The Module interface does for extension modules what SyndFeed does for feeds, exposing data via bean-style "get" methods. The URI of a syndication module uniquely identifies it, so that is what the sample program displays. The fact that all components are managed as beans makes them easy to use and exchange, for example in a server-side aggregator.

    Be Cautious of Uncertain Turns

    There are a few different types of exceptions that may get thrown in the process of reading and parsing a feed. The sample code takes a short-cut approach by just putting the whole block of main logic in a try-catch construct. The URL class may complain if the input is badly formed, and the XmlReader and SyndFeedInput classes have their error cases, as well. Some of these are lower-level I/O exceptions that get propagated upwards.

    Depending on how you are using the ROME libraries, you may want to have finer control over the exception handling. Or you can follow the example here for a simpler approach.
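    As a rough illustration of what finer control could look like, the sketch below separates the "bad URL" case from the "could not read" case using only JDK classes. The class name FeedFetchSketch, the probe method, and its status strings are hypothetical and are not part of ROME or of the article's sample code. In ROME itself, SyndFeedInput.build additionally declares a FeedException for parse failures, which would get a catch block of its own.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper for illustration only; not part of ROME.
class FeedFetchSketch {

    /** Returns "OK" or a short status naming the step that failed. */
    static String probe(String address) {
        final URL feedUrl;
        try {
            // Step 1: the address must at least be a well-formed URL.
            feedUrl = new URL(address);
        } catch (MalformedURLException e) {
            return "BAD_URL: " + e.getMessage();
        }
        // Step 2: the resource must be readable; the stream is closed for us.
        try (InputStream in = feedUrl.openStream()) {
            in.read();
            return "OK";
        } catch (IOException e) {
            return "IO_ERROR: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(probe(args.length > 0 ? args[0] : "not a url"));
    }
}
```

    Handling each step separately lets a program report a typo in the address differently from a feed that is temporarily unreachable, which a single catch-all block cannot do.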

    Making Your Own Routes

    ROME is by no means limited to just reading feeds. It provides beans for creating them, as well. To demonstrate this, we'll partially recreate the "inbox" functionality of the del.icio.us social bookmarking service.

    The inbox feature allows a user to choose several feeds--from individual tags such as "java" or from other users such as "rjray"--and combine them into a single feed. It effectively merges all of the separate RSS channels into one, which is itself made available as RSS. The code in DeliciousMerger.java reproduces this, with a difference: it outputs all of the entries, not just the 30 most-recent ones.

    Diving straight into the code:

    final StringBuffer tagList = new StringBuffer(args[0]);
    for (int argidx = 1; argidx < args.length; argidx++) {
        tagList.append(", ");
        tagList.append(args[argidx]);
    }
    newFeed.setTitle("Combined del.icio.us Tags: " + tagList);
    newFeed.setDescription("Aggregation of tags: " + tagList);
    newFeed.setFeedType("rss_1.0");
    newFeed.setAuthor("DeliciousMerger");
    newFeed.setLink("http://del.icio.us");

    The StringBuffer and the loop at the top create a string that combines all of the tags passed on the command line, for use in the title and description of the new feed. The five setter calls that follow start the creation of the new feed by setting the title, description, feed type, author, and base URL. The type is worth looking at more closely: at present, ROME can produce feeds in several flavors of RSS, as well as Atom 0.3 and Atom 1.0. The type-strings for the options are:

    • rss_0.9
    • rss_0.91
    • rss_0.92
    • rss_0.93
    • rss_0.94
    • rss_1.0
    • rss_2.0
    • atom_0.3
    • atom_1.0

    In the example code, an RSS 1.0 feed is being created.

    The following lines are all that are needed to collect the entries from each feed (per the tags given on the command line) into a single list:

    feedUrl = new URL(urlBase + args[idx]);
    feed = input.build(new XmlReader(feedUrl));
    entries.addAll(feed.getEntries());

    The first two lines are virtually identical to the first example, while the third line takes advantage of the List-style return value of getEntries to simplify collecting entries.

    Further down, the code does a little shuffling around to convert the contents of the ArrayList into an ordinary array of SyndEntry objects:

    SyndEntry[] entriesArray = new SyndEntry[entries.size()];
    entriesArray = (SyndEntry[])entries.toArray(entriesArray);
    Arrays.sort(entriesArray, merger.new OrderByDate());

    First, the toArray method from the List interface is used to get the array representation (the odd calling syntax, passing in an array of the desired type, lets toArray know what type of array it should create and return).

    The sorting itself uses a small inner class (defined further down) that implements the Comparator interface, in order to sort the array of entries by their dates. This turns the list from several segments that were sorted individually into one single list sorted completely. The built-in comparison logic of the Date class does the real work of sorting for us.

    The next line after sorting simply sets the entries for the new feed by calling setEntries with the array of entries wrapped back into a List. Going from the list to the array and back was just for the sake of sorting.
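    The list-to-array-and-back round trip can be tried out without ROME. The sketch below is a stand-alone illustration under stated assumptions: DatedEntry is a hypothetical stand-in for SyndEntry that carries only a title and a date, the newest-first ordering is an assumption (the article does not say which direction OrderByDate sorts), and Arrays.asList is one plausible way to get back to a List.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Date;
import java.util.List;

class SortRoundTripSketch {

    // Hypothetical stand-in for a feed entry; only what the sort needs.
    static class DatedEntry {
        final String title;
        final Date published;

        DatedEntry(String title, Date published) {
            this.title = title;
            this.published = published;
        }
    }

    // Newest first is an assumption; swap a and b for oldest first.
    static final Comparator<DatedEntry> NEWEST_FIRST = new Comparator<DatedEntry>() {
        public int compare(DatedEntry a, DatedEntry b) {
            // Date.compareTo does the actual comparison work.
            return b.published.compareTo(a.published);
        }
    };

    static List<DatedEntry> sortedCopy(List<DatedEntry> entries) {
        // The typed array argument tells toArray which array type to create.
        DatedEntry[] array = entries.toArray(new DatedEntry[entries.size()]);
        Arrays.sort(array, NEWEST_FIRST);
        // Arrays.asList wraps the sorted array as a fixed-size List.
        return Arrays.asList(array);
    }

    public static void main(String[] args) {
        List<DatedEntry> merged = new ArrayList<DatedEntry>();
        merged.add(new DatedEntry("older", new Date(1000L)));
        merged.add(new DatedEntry("newest", new Date(3000L)));
        merged.add(new DatedEntry("middle", new Date(2000L)));
        for (DatedEntry e : sortedCopy(merged)) {
            System.out.println(e.title);
        }
    }
}
```

    If the array detour is not needed for other reasons, Collections.sort(list, comparator) sorts a List in place with the same Comparator.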

    After all of this, writing the new feed is almost anti-climactic. It's almost easier than SyndFeedInput was, since it is being sent to the console:

    output.output(newFeed, new PrintWriter(System.out));

    With newFeed capable of turning itself into XML, all the SyndFeedOutput object needs is a Writer to send it to.

    Let Us Not Circle Over the Same Paths

    The DeliciousMerger class combines everything like the del.icio.us inbox feature does, but it also repeats elements as the inbox does. And since del.icio.us is social in nature, when a link pops up in a feed, it is often linked by others using the same tag, causing it to reappear. Let's fix that.

    The code in DeliciousMerger2.java is based very closely on DeliciousMerger.java. Where it differs is in the for loop about halfway down the code:

    for (final Iterator iter = feed.getEntries().listIterator(); iter.hasNext(); ) {
        final SyndEntry entry = (SyndEntry)iter.next();
        if (! seenUrls.containsKey(entry.getLink())) {
            entries.add(entry);
            seenUrls.put(entry.getLink(), entry);
        }
    }

    Here, rather than adding all the entries blindly, we keep a HashMap object that we use to keep track of each URL as it is seen. If a URL is already present in the map, then it doesn't get added to the new list a second (or third) time. Those lines (plus the declaration of seenUrls and the extra imports) are the only differences between the two.
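    The same first-occurrence-wins filtering can be sketched on plain strings, standing in for the values that entry.getLink() would return. This version uses a HashSet rather than the HashMap of the sample code, since only the keys matter here; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DedupSketch {

    /** Keeps the first occurrence of each link, preserving input order. */
    static List<String> uniqueLinks(List<String> links) {
        Set<String> seen = new HashSet<String>();
        List<String> unique = new ArrayList<String>();
        for (String link : links) {
            // Set.add returns false when the link was already present,
            // so the check and the bookkeeping happen in one call.
            if (seen.add(link)) {
                unique.add(link);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<String> links = Arrays.asList(
                "http://example.com/a",
                "http://example.com/b",
                "http://example.com/a");
        System.out.println(uniqueLinks(links));
    }
}
```

    Note that this treats links as duplicates only when the strings match exactly; two URLs that differ by a trailing slash would both be kept.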

    No Need to Revisit Known Routes

    Because ROME needs the JDom package anyway, you have it at your fingertips, available should you need to parse any XML of your own. Because of this, filtering out URLs that you have already saved is almost as easy as eliminating duplicates was. You can fetch your full set of bookmarks from del.icio.us and use them to pre-populate the seenUrls map.

    DeliciousMerger3.java does this by adding a static method called readDelBookmarkFile, which sits towards the bottom (before the private inner class we use for sorting). Since this article isn't about JDom, we'll go lightly on this part. The command-line argument list now expects the first argument to be the name of the file to which you saved your complete bookmark list. The parsing of this file is very simplistic, and only looks for the bare minimum tag and attribute sets needed to get the data we want. Since JDom gives us the matching child elements (we want the ones named post) in a handy List, sticking them in the table is as easy as looping over the feed entries elsewhere:

    for (final Iterator iter = children.iterator(); iter.hasNext(); ) {
        element = (Element)iter.next();
        key = element.getAttributeValue("href");
        if ((key != null) && (key.length() > 0)) {
            marks.put(key, key);
        }
    }

    Because we're parsing with JDom, reading the file may throw a JDOMException, not just IOException. The catch block checks for this. The creation of the string that combines the tags into a comma-separated list starts at argument 2 rather than 1, since the first argument is now the bookmarks filename.
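    For readers who want to try the bookmark-reading idea without JDom on the classpath, here is a comparable sketch that swaps in the JDK's own JAXP DOM parser. It is not the article's readDelBookmarkFile: the class name, the method name, and the posts root element in the sample data are assumptions, and the checked exceptions are SAXException and ParserConfigurationException rather than JDom's.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

class BookmarkFileSketch {

    /** Collects the non-empty href attributes of all post elements. */
    static Set<String> readHrefs(InputStream xml)
            throws IOException, SAXException, ParserConfigurationException {
        Set<String> hrefs = new HashSet<String>();
        // Unlike a direct-children lookup, getElementsByTagName matches
        // post elements at any depth in the document.
        NodeList posts = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(xml)
                .getElementsByTagName("post");
        for (int i = 0; i < posts.getLength(); i++) {
            String href = ((Element) posts.item(i)).getAttribute("href");
            // DOM's getAttribute returns "" when the attribute is absent.
            if (!href.isEmpty()) {
                hrefs.add(href);
            }
        }
        return hrefs;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample data; the second post has no href and is skipped.
        String sample = "<posts><post href=\"http://example.com/a\"/><post/></posts>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(readHrefs(in));
    }
}
```

    The resulting set plays the role of the pre-populated seenUrls map: any entry whose link is already in it can be skipped during the merge.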

    More Roads to be Explored

    This should give you a good start on using ROME. It doesn't end here; ROME has even more features, such as creating feeds, injecting module information, etc. The examples get you going, and hopefully provide room to experiment and expand. You could implement command-line options to control the number of elements produced by the merger classes, or sub-class the beans to allow extra annotations (what the source was, or information for CSS/XHTML rendering).
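    As one small starting point for such experiments, capping the number of entries (as the del.icio.us inbox does with its 30 most-recent items) comes down to taking the head of the already-sorted list. The helper below is a hypothetical sketch that works on any List; where the limit comes from, for example a command-line option, is left open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EntryLimitSketch {

    /** Returns a copy of at most the first `limit` items of a sorted list. */
    static <T> List<T> firstN(List<T> sorted, int limit) {
        if (limit < 0) {
            throw new IllegalArgumentException("limit must be >= 0");
        }
        // subList is only a view, so copy it to detach from the source list.
        int end = Math.min(limit, sorted.size());
        return new ArrayList<T>(sorted.subList(0, end));
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList("newest", "middle", "older");
        System.out.println(firstN(titles, 2));
    }
}
```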

    ROME also has a plugin model. The ROME project's Wiki page provides links to some current plugin projects. These provide support for RSS modules such as site content, iTunes podcasting extensions, and Creative Commons license information, and can be used as examples for writing your own.

    But that is a road for another day.