What Is RSS?
RSS is an application of XML that defines a sequenced list of content. RSS calls this list a channel. Within the channel are one or more items, each of which is usually located at a URL. The feed also contains metadata about the channel and each item: the feed can specify an image to be used as the logo of the channel, a description of each item, and so on.
RSS was originally created by Netscape for use on its My Netscape portal. Users needed to be able to add channels of content to their portals, and Netscape wanted a consistent way to represent those channels. Thus, the first version of RSS was born as Version 0.9 in March of 1999. In this initial specification, the letters RSS stood for RDF Site Summary. Since then, RSS has been used as an acronym for two additional terms:
- Rich Site Summary
- Really Simple Syndication
In addition, some people involved with the development of RSS now claim that it is not an acronym at all.
Blogs and Podcasting
Two popular uses for RSS are blogs and podcasting. A blog is generally a web site containing a series of entries. Although a site does not strictly need an RSS feed to be called a blog, the majority of blogs have an RSS feed available and many blogging applications are built on RSS. Podcasting, on the other hand, is very much tied to RSS. Podcasting is the distribution of multimedia content through a syndication feed, generally using the enclosure element within RSS 2.0. A podcatcher (an application that reads podcast feeds) downloads the media files referenced by a podcast feed. Most podcatchers are designed to put downloaded files in a specific location on a user's computer, from which they will be copied to a portable audio or video player.
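As a rough illustration of the first step a podcatcher performs, the following Python sketch lists the media files referenced by enclosure elements in an RSS 2.0 feed. The feed content, URLs, and the function name list_enclosures are made up for this example; a real podcatcher would go on to download each URL.

```python
import xml.etree.ElementTree as ET

# Hypothetical podcast feed; the URLs and sizes are invented for illustration.
PODCAST_FEED = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <link>http://www.example.org/podcast/</link>
    <description>An example podcast feed</description>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://www.example.org/media/episode1.mp3"
                 length="1234567" type="audio/mpeg"/>
    </item>
    <item>
      <title>Text-only post</title>
    </item>
  </channel>
</rss>"""


def list_enclosures(feed_xml):
    """Return (item title, media URL, MIME type) for each item with an enclosure."""
    root = ET.fromstring(feed_xml)
    found = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # not every item carries a media file
        found.append((item.findtext("title"),
                      enclosure.get("url"),
                      enclosure.get("type")))
    return found


enclosures = list_enclosures(PODCAST_FEED)
for title, url, mime_type in enclosures:
    print(title, url, mime_type)
```

Items without an enclosure are simply skipped, so the same feed can mix ordinary blog entries with podcast episodes.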
RSS Variants
Nine different specifications have been released under the name RSS. These can be separated into those that are based on the Resource Description Framework (RDF) and those that aren't, as seen in Table 12-1.
Table 12-1. RSS variants
Based on RDF | Not based on RDF |
---|---|
RSS 0.9 | RSS 0.91 (both Netscape and Userland versions) |
RSS 1.0 | RSS 0.92 through RSS 0.94 |
 | RSS 2.0 |
This chapter will focus on RSS 1.0 and RSS 2.0, the two current versions. For comparison, Examples 12-1 and 12-2 contain excerpts from RSS 1.0 and 2.0 feeds respectively.
Example 12-1. Example RSS 1.0 feed
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel>
    <title>Example RSS 1.0 Feed</title>
    <link>http://www.example.org</link>
    <description>The Example Organization web site</description>
    <image rdf:about="http://www.example.org/images/logo.gif">
      <title>Example</title>
      <url>http://www.example.org/images/logo.gif</url>
      <link>http://www.example.org</link>
    </image>
    <items>
      <rdf:Seq>
        <rdf:li resource="http://www.example.org/item1/"/>
        <rdf:li resource="http://www.example.org/item2/"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://www.example.org/item1/">
    <title>New Status Updates</title>
    <link>http://www.example.org/item1/</link>
    <description>News about the Example project</description>
  </item>
  <item rdf:about="http://www.example.org/item2/">
    <title>Another New Status Updates</title>
    <link>http://www.example.org/item2/</link>
    <description>More news about the Example project</description>
  </item>
  <textinput rdf:about="http://www.example.org/search/">
    <title>Search example.org</title>
    <description>Search the website www.example.org</description>
    <name>searchterm</name>
    <link>http://www.example.org/search/</link>
  </textinput>
</rdf:RDF>
Example 12-2. Example RSS 2.0 feed
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example RSS 2.0 Feed</title>
    <link>http://www.example.org</link>
    <description>The Example Organization web site</description>
    <image>
      <title>Example</title>
      <url>http://www.example.org/images/logo.gif</url>
      <link>http://www.example.org</link>
    </image>
    <textInput>
      <title>Search this site:</title>
      <description>Find:</description>
      <name>q</name>
      <link>http://example.com/search</link>
    </textInput>
    <item>
      <title>New Status Updates</title>
      <link>http://www.example.org/item1/</link>
      <guid isPermaLink="true">http://www.example.org/item1/</guid>
      <description>News about the Example project</description>
    </item>
    <item>
      <title>Another New Status Updates</title>
      <link>http://www.example.org/item2/</link>
      <guid isPermaLink="true">http://www.example.org/item2/</guid>
      <description>More news about the Example project</description>
    </item>
  </channel>
</rss>
As you can see from these examples, although the vocabulary (channel, item, and so on) is the same, these documents have important syntactical differences. Most significantly, the root elements and namespaces are different. In RSS 1.0, the root element is named RDF in the namespace http://www.w3.org/1999/02/22-rdf-syntax-ns# and the RSS 1.0 elements are in the namespace http://purl.org/rss/1.0/. In RSS 2.0, the root element is named rss; it and all the other RSS 2.0 elements are in no namespace.
In RSS 2.0, the description element can contain HTML markup. The HTML elements must be either XML-escaped or within a CDATA block. The descriptions in Example 12-2 could be enhanced with:
<description>News about the &lt;b&gt;Example&lt;/b&gt; project</description>
Or with CDATA:
<description> <![CDATA[<i>More</i> news about the <b>Example</b> project]]> </description>
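To see why the escaping matters, the following Python sketch (using the standard library's xml.etree.ElementTree, chosen here only for illustration) parses three variants of a description. The escaped and CDATA forms yield the same text, while unescaped markup is parsed as a child element and is no longer part of the description's text.

```python
import xml.etree.ElementTree as ET

ESCAPED = ("<description>News about the &lt;b&gt;Example&lt;/b&gt; "
           "project</description>")
CDATA = ("<description><![CDATA[News about the <b>Example</b> project]]>"
         "</description>")
UNESCAPED = "<description>News about the <b>Example</b> project</description>"

escaped_text = ET.fromstring(ESCAPED).text
cdata_text = ET.fromstring(CDATA).text
unescaped = ET.fromstring(UNESCAPED)

# Both safe encodings give the reader the same HTML string.
print(escaped_text == cdata_text)

# Unescaped markup becomes XML structure: the text stops at the <b> element.
print(repr(unescaped.text), [child.tag for child in unescaped])
```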
In addition to the elements shown in Example 12-2, RSS 2.0 has many optional elements available at both the channel and item levels. Table 12-2 lists these additional elements.
Table 12-2. Additional RSS 2.0 elements
channel subelements | item subelements |
---|---|
language | author |
copyright | category |
managingEditor | comments |
webMaster | enclosure |
pubDate | pubDate |
lastBuildDate | source |
generator | |
docs | |
cloud | |
ttl | |
rating | |
skipHours | |
skipDays | |
Some of these will be examined later in this chapter. For full definitions of all of the RSS 2.0 elements, please refer to the specification at http://www.rssboard.org/rss-specification.
What's RDF?
I mentioned the Resource Description Framework (RDF) a few times before. RDF is a set of World Wide Web Consortium (W3C) specifications for expressing properties that describe a resource. RDF expressions are composed of three values: a subject, a predicate, and an object. If the expression "The shape of the ball is round" were expressed in RDF, the subject would be "the ball," the predicate would be "shape," and the object would be "round." Example 12-1 contains pretty much all the RDF you'll need to know to work with RSS 1.0 documents: the Seq and li elements, which define a list, and the about attribute, which associates an item with the identifier set in the resource attribute of each li element. For more information on RDF, Practical RDF by Shelley Powers (O'Reilly) is highly recommended.
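The association between li and item can be sketched in Python as follows. This is a minimal illustration, not a full RDF processor: it assumes a cut-down version of Example 12-1 in which the item elements are deliberately placed in reverse order, so that the order of the result visibly comes from the Seq rather than from document order. The function name ordered_titles is made up, and the fallback to a namespace-qualified rdf:resource attribute is an assumption about feeds other than this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS_NS = "http://purl.org/rss/1.0/"
NS = {"rdf": RDF_NS, "rss": RSS_NS}
ABOUT = "{%s}about" % RDF_NS        # qualified name of the rdf:about attribute
RESOURCE = "{%s}resource" % RDF_NS  # qualified name of rdf:resource

FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel>
    <title>Example RSS 1.0 Feed</title>
    <items>
      <rdf:Seq>
        <rdf:li resource="http://www.example.org/item1/"/>
        <rdf:li resource="http://www.example.org/item2/"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://www.example.org/item2/">
    <title>Another New Status Updates</title>
  </item>
  <item rdf:about="http://www.example.org/item1/">
    <title>New Status Updates</title>
  </item>
</rdf:RDF>"""


def ordered_titles(feed_xml):
    """Return item titles in the order given by the channel's rdf:Seq."""
    root = ET.fromstring(feed_xml)
    # Index every item by its rdf:about identifier.
    by_about = {item.get(ABOUT): item for item in root.findall("rss:item", NS)}
    titles = []
    for li in root.findall("rss:channel/rss:items/rdf:Seq/rdf:li", NS):
        # Example 12-1 uses an unqualified resource attribute; also accept
        # a qualified rdf:resource (an assumption about other feeds).
        resource = li.get("resource") or li.get(RESOURCE)
        item = by_about.get(resource)
        if item is not None:
            titles.append(item.findtext("rss:title", namespaces=NS))
    return titles


titles = ordered_titles(FEED)
print(titles)
```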
RSS Modules
Both RSS 1.0 and 2.0 are extensible through the use of RSS modules. An RSS module is simply a set of elements in a namespace other than the namespace of the host RSS document. RSS modules are widely used in both RSS 1.0 and 2.0 documents. Although some modules are specified for a particular version of RSS, most will work with either. The RSS 1.0 specification defines three modules: Dublin Core, Syndication, and Content.
Dublin Core
The Dublin Core Metadata Initiative (DCMI) is an organization dedicated to creating standardized metadata vocabularies. Dublin Core allows metadata to be expressed using the same terms in a variety of formats. Dublin Core elements are commonly seen in HTML/XHTML, RDF (including RSS 1.0), and XML (including RSS 2.0) documents. More information about Dublin Core can be found at http://purl.org/dc. Dublin Core Simple, the basic set of Dublin Core metadata, contains 15 elements in the namespace http://purl.org/dc/elements/1.1/:
- title
- creator
- subject
- description
- publisher
- contributor
- date
- type
- format
- identifier
- source
- language
- relation
- coverage
- rights
As you can see, some of these elements can be used to bring some of the extra elements from RSS 2.0 into RSS 1.0: the Dublin Core elements date, language, and rights can hold the same data as the RSS 2.0 elements pubDate, language, and copyright.
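Because a module is just a set of namespaced elements, reading Dublin Core metadata comes down to filtering child elements by namespace. The sketch below assumes a hypothetical RSS 2.0 feed that carries a few Dublin Core elements; the metadata values and the helper name dublin_core are invented for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical feed: the dc:* values below are made up for this example.
FEED = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example RSS 2.0 Feed</title>
    <link>http://www.example.org</link>
    <description>The Example Organization web site</description>
    <dc:language>en</dc:language>
    <dc:rights>Copyright 2006 Example Organization</dc:rights>
    <item>
      <title>New Status Updates</title>
      <dc:creator>A. N. Author</dc:creator>
      <dc:date>2006-07-30T12:00:00Z</dc:date>
    </item>
  </channel>
</rss>"""


def dublin_core(element):
    """Return the Dublin Core children of element as {local name: text}."""
    prefix = "{" + DC_NS + "}"
    return {child.tag[len(prefix):]: child.text
            for child in element
            if child.tag.startswith(prefix)}


channel = ET.fromstring(FEED).find("channel")
channel_dc = dublin_core(channel)
item_dc = dublin_core(channel.find("item"))
print(channel_dc)
print(item_dc)
```

The same helper would work on an RSS 1.0 channel or item, since it only looks at the namespace of each child element.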
Syndication
The RSS 1.0 Syndication module, in the namespace http://purl.org/rss/1.0/modules/syndication/, adds elements describing how often the feed is updated. Its updatePeriod and updateFrequency elements let you define pretty much any consistent update schedule. For example, to declare that a feed is updated twice hourly, you could add the following to your feed:
<sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>2</sy:updateFrequency>
This same schedule could be expressed with the RSS 2.0 ttl element:
<ttl>30</ttl>
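The equivalence between the two notations is simple arithmetic: one hour divided by two updates is 30 minutes. The Python sketch below makes that conversion explicit. It covers only the hourly, daily, and weekly periods, because months and years have no fixed length in minutes; that restriction, and the function name update_interval_minutes, are choices made for this example.

```python
import xml.etree.ElementTree as ET

SY = "{http://purl.org/rss/1.0/modules/syndication/}"

# Only periods with a fixed length are handled in this sketch.
PERIOD_MINUTES = {"hourly": 60, "daily": 24 * 60, "weekly": 7 * 24 * 60}

CHANNEL = """<channel xmlns:sy="http://purl.org/rss/1.0/modules/syndication/">
  <sy:updatePeriod>hourly</sy:updatePeriod>
  <sy:updateFrequency>2</sy:updateFrequency>
</channel>"""


def update_interval_minutes(channel):
    """Minutes between updates, or None if the schedule is not handled here."""
    # The Syndication module's defaults are a period of daily
    # and a frequency of 1.
    period = (channel.findtext(SY + "updatePeriod") or "daily").strip()
    frequency = int(channel.findtext(SY + "updateFrequency") or "1")
    minutes = PERIOD_MINUTES.get(period)
    if minutes is None or frequency < 1:
        return None
    return minutes / frequency


interval = update_interval_minutes(ET.fromstring(CHANNEL))
ttl = int(ET.fromstring("<ttl>30</ttl>").text)
print(interval, ttl)
```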
Content
The RSS 1.0 Content module, in the namespace http://purl.org/rss/1.0/modules/content/, enables the embedding of HTML content as an RSS 1.0 item's description. A formatted version of the description in Example 12-1 could be included as:
<content:encoded><![CDATA[<i>More</i> news about the Example project]]> </content:encoded>
Embedding HTML in RSS 1.0 with the Content module is actually superior to embedding HTML in RSS 2.0, because there's no way to indicate whether the content of an RSS 2.0 description element should be treated as HTML. As a result, there is no way to distinguish between these two descriptions:
- More news about Example project, with the word "More" rendered in italics
- <i>More</i> news about Example project, with the tags displayed as literal text
This may not seem like a problem, but if you have a feed about HTML markup, it can be important.
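With the Content module, a reader can tell the two cases apart by element name alone: description holds plain text and content:encoded holds HTML. The following Python sketch shows this for a single item modeled on Example 12-1; the function name item_bodies is made up for illustration.

```python
import xml.etree.ElementTree as ET

RSS = "{http://purl.org/rss/1.0/}"
CONTENT = "{http://purl.org/rss/1.0/modules/content/}"

ITEM = """<item xmlns="http://purl.org/rss/1.0/"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:content="http://purl.org/rss/1.0/modules/content/"
      rdf:about="http://www.example.org/item2/">
  <title>Another New Status Updates</title>
  <description>More news about the Example project</description>
  <content:encoded><![CDATA[<i>More</i> news about the Example project]]></content:encoded>
</item>"""


def item_bodies(item):
    """Return (plain-text description, HTML body) for an RSS 1.0 item."""
    plain = item.findtext(RSS + "description")
    html = item.findtext(CONTENT + "encoded")  # None if the module is unused
    return plain, html


plain, html = item_bodies(ET.fromstring(ITEM))
print(plain)
print(html)
```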
CommentAPI
The CommentAPI defines an interface for blogs to accept comments without requiring the user to fill out a form on a web site. Instead, comments can be accepted directly from an RSS aggregator. Comments are posted as RSS 2.0 items. So that an aggregator can discover the URL to which the comment XML should be posted, the CommentAPI module, which uses the namespace http://wellformedweb.org/CommentAPI/, defines a comment element that contains that URL. More information about the CommentAPI module is available at the namespace URL.
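The two halves of that exchange, discovering the URL and building the item to post, might look like the following Python sketch. The endpoint URL, the choice of title, author, and description as the comment's fields, and both function names are assumptions made for illustration; which fields a given blog accepts depends on that blog, and the sketch stops short of sending the HTTP request.

```python
import xml.etree.ElementTree as ET

WFW = "{http://wellformedweb.org/CommentAPI/}"

# Hypothetical item; the comment URL is invented for this example.
ITEM = """<item xmlns:wfw="http://wellformedweb.org/CommentAPI/">
  <title>New Status Updates</title>
  <link>http://www.example.org/item1/</link>
  <wfw:comment>http://www.example.org/item1/comments</wfw:comment>
</item>"""


def comment_endpoint(item):
    """Return the URL that accepts comments for this item, or None."""
    url = item.findtext(WFW + "comment")
    return url.strip() if url else None


def build_comment(title, author, text):
    """Serialize a comment as an RSS 2.0 item (field choice is an assumption)."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "author").text = author
    # ElementTree XML-escapes any markup in the text when serializing.
    ET.SubElement(item, "description").text = text
    return ET.tostring(item, encoding="unicode")


endpoint = comment_endpoint(ET.fromstring(ITEM))
payload = build_comment("Re: New Status Updates", "reader@example.org",
                        "Nice <b>update</b>!")
print(endpoint)
print(payload)
```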
iTunes
When Apple Computer's iTunes Music Store added support for podcasting in mid-2005, it introduced an RSS module that adds elements to RSS 2.0 to support the podcast directory within the iTunes Music Store. This module uses the namespace http://www.itunes.com/dtds/podcast-1.0.dtd and is fully documented at http://www.apple.com/itunes/podcasts/techspecs.html. To be listed within the iTunes podcast directory, your podcast feed must use this module.
Atom
The Atom Syndication Format was created in an attempt to merge the simplicity of RSS 2.0 (like RSS 2.0, Atom doesn't use RDF) with the more structured aspects of RSS 1.0 (for one thing, all Atom elements are within a namespace). The feeds in Examples 12-1 and 12-2 can be written in Atom as shown in Example 12-3.
Example 12-3. Example Atom feed
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Atom Feed</title>
  <subtitle>The Example Organization web site</subtitle>
  <link href="http://www.example.org/"/>
  <id>urn:uuid:68063c50-1f77-11db-a98b-0800200c9a66</id>
  <entry>
    <title>New Status Updates</title>
    <link href="http://www.example.org/item1/"/>
    <id>urn:uuid:68063c51-1f77-11db-a98b-0800200c9a66</id>
    <summary>News about the Example project</summary>
  </entry>
  <entry>
    <title>More New Status Updates</title>
    <link href="http://www.example.org/item2/"/>
    <id>urn:uuid:975ceb20-1f77-11db-a98b-0800200c9a66</id>
    <summary type="html"><![CDATA[<i>More</i> news about the Example project]]></summary>
  </entry>
</feed>
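Reading an Atom feed follows the same pattern as reading RSS 1.0, except that every element lives in the single Atom namespace. The Python sketch below parses a cut-down copy of the Atom example and reports each entry's title, link, and summary. Note the type attribute on summary: unlike an RSS 2.0 description, an Atom summary says whether it holds HTML, and it is treated as plain text when the attribute is absent. The function name read_entries is made up for illustration.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Atom Feed</title>
  <link href="http://www.example.org/"/>
  <entry>
    <title>New Status Updates</title>
    <link href="http://www.example.org/item1/"/>
    <summary>News about the Example project</summary>
  </entry>
  <entry>
    <title>More New Status Updates</title>
    <link href="http://www.example.org/item2/"/>
    <summary type="html"><![CDATA[<i>More</i> news about the Example project]]></summary>
  </entry>
</feed>"""


def read_entries(feed_xml):
    """Return one dict per Atom entry with title, href, and summary details."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM):
        link = entry.find("atom:link", ATOM)
        summary = entry.find("atom:summary", ATOM)
        entries.append({
            "title": entry.findtext("atom:title", namespaces=ATOM),
            "href": link.get("href") if link is not None else None,
            "summary": summary.text if summary is not None else None,
            # A missing type attribute means plain text.
            "summary_type": (summary.get("type", "text")
                             if summary is not None else None),
        })
    return entries


entries = read_entries(FEED)
for entry in entries:
    print(entry["title"], entry["href"], entry["summary_type"])
```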
There is also a related Atom Publishing Protocol that defines a standard API for creating and editing entries on a blog. Both Atom specifications are developed by the AtomPub Working Group, part of the Internet Engineering Task Force (IETF). Although Atom is an interesting set of technologies, we will not be looking extensively at Atom here. For more details on Atom, see the Working Group's web site: http://www.ietf.org/html.charters/atompub-charter.html.