Combining RSS Feeds and Displaying Them on Your Page with JavaScript and PHP

Last night I was trying to do something that I thought would be pretty simple: display a bunch of recent weblog posts on one page.

There is a great online community of folks in the biofuels blogosphere, and this page would give a quick summary of their myriad, nerdy, wonderful events and research.

So the goal is to have the title of a weblog, followed by its most recent posts, each with the date posted and a bit of the post body. The entire web page might be called "biofuels digest," with a total of perhaps 30 weblogs. Often on the web you will see "blogrolls" that list lots of blogs, but these are usually just links to the blogs (there isn't a post excerpt), and they are almost always either hardcoded HTML or a JavaScript include from a third party like Bloglines (see my own blogroll on the front page).
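As a rough illustration of that layout, here is a minimal JavaScript sketch that turns one already-parsed blog into a title plus a short list of dated excerpts. Everything in it is an assumption made for the example: the data shape ({ title, items }), the function names renderBlog, excerpt and escapeHtml, and the plain-text summaries. It is not how Magpie or any other tool mentioned in this post does it.

```javascript
// Hypothetical data shape, assumed for this sketch:
// blog = { title: "...", items: [{ title, link, date, summary }] }
// with items already sorted newest first and summaries already plain text.

// Escape characters that would otherwise be read as HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Keep only "a bit of the post body": cut long text and mark the cut.
function excerpt(text, maxChars) {
  if (text.length <= maxChars) return text;
  return text.slice(0, maxChars).trimEnd() + "...";
}

// Blog title, then up to maxPosts posts, each with date and excerpt.
function renderBlog(blog, maxPosts, maxChars) {
  const posts = blog.items.slice(0, maxPosts).map((item) =>
    "<li><a href=\"" + escapeHtml(item.link) + "\">" + escapeHtml(item.title) + "</a> " +
    "<small>" + escapeHtml(item.date) + "</small>" +
    "<p>" + escapeHtml(excerpt(item.summary, maxChars)) + "</p></li>"
  );
  return "<h2>" + escapeHtml(blog.title) + "</h2>\n<ul>\n" + posts.join("\n") + "\n</ul>";
}
```

Calling renderBlog once per weblog and joining the results would give the whole digest page. The same string could just as well be built server-side in PHP.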

I'd had experience with building this type of page last year, when I just wanted to have an "aggregator" page of all my most loved online reading. I ended up just slapping things around with Magpie RSS (an excellent open source PHP class), and it worked fine. Not slick, but fine.

I could have easily used a number of services that are available online for displaying other people's RSS on your own page, without all the mussing with PHP. (Feedburner or Feed Digest are services that I'd recommend for doing this type of thing, if you want to go that route.) But who wants to mess with a bunch of JavaScript calls to someone else's server? And you get stuck with limits on the number of feeds you can run. And then there's the annoying "powered by ..." sticker at the bottom. And you'd have to use a third-party RSS splicer to combine all of your feeds.

So forget all that, because this isn't just a wonky personal project — it will hopefully end up being part of the excellent Piedmont Biofuels website — so it needs to be quick and hosted on the server.

So last night I downloaded the latest release of MagpieRSS and installed it on my server, created all of the necessary PHP for each of the blogs, and ended up with a decent document. The major problems with this first version (using just the Magpie class) are the inconsistent treatment of the posts (some appear and some don't) and the improper encoding of the blogs. (I went 'round and 'round with the encoding. It's a common problem, but I couldn't get those damn posts clean.) Probably a few days in the Magpie listserv archives at SourceForge would clear all of this up ... but the archives are exceptionally annoying, the Magpie blog is down, and the first version was still surprisingly slow anyway, even with the cache working.

So I found another solution, Alan Levine's Feed to JS, which is built on Magpie. This is an excellent free (and libre) service that has both a hosted version and a downloadable script. It relies on the Magpie class but uses JavaScript to display the results, with the added benefit of an administrator's page that simplifies some of the display options (such as the number of posts). It also yields much more compliant UTF-8 encoding (no more bloody diamond question marks in place of fancy quotes).

The downside: there is currently no way to "splice" all of the feeds together before running them through the JavaScript, so you end up calling the JS file once for each feed you parse. I felt sure this would make it too slow to be usable, but I think (hope) I was wrong, even with 25 blogs on the same page. I trimmed each of the blogs to display only 3 posts anyway, so at least the compiled filesize is real slim.

(Actually, while I'm writing all this out, I should bother to mention that there are a hell of a lot of third-party RSS splicers/combiners ... but, again, they're all third party, and they seem to go extinct quickly: e.g. the defunct rollup.org. Most of these also have ads, are not free, or have limitations on the number of feeds, like Feed Digest. I was surprised and disappointed that I couldn't find something to install on the server that would take care of this. Somebody please let me know if there's something reliable out there. This would allow me to combine all the posts and just run the JS once.)
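For what it's worth, the splicing step itself is small once the feeds have been parsed. The sketch below is only an illustration of the idea, not a feature of Magpie, Feed to JS, or any of the services named above. It assumes each feed has already been turned into a hypothetical { title, items } object with a parseable date on every item. The names spliceFeeds and timestamp are made up for the example.

```javascript
// Hypothetical data shape, assumed for this sketch:
// feeds = [{ title: "Blog name", items: [{ title, link, date, summary }] }, ...]
// Fetching and parsing the RSS is assumed to have happened elsewhere.

// Turn a date string into a number for sorting; unparseable dates become 0,
// which pushes those posts to the end of a newest-first list.
function timestamp(dateString) {
  const t = Date.parse(dateString);
  return Number.isNaN(t) ? 0 : t;
}

// Combine the posts from every feed into one list, newest first,
// and keep only the first `limit` of them.
function spliceFeeds(feeds, limit) {
  const merged = [];
  for (const feed of feeds) {
    for (const item of feed.items) {
      // Copy the item and remember which blog it came from.
      merged.push({ blog: feed.title, ...item });
    }
  }
  merged.sort((a, b) => timestamp(b.date) - timestamp(a.date));
  return merged.slice(0, limit);
}
```

With something like this in place, the page would need a single display call on the combined list instead of one call per feed.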

So, from a programmer's view, this is a little inelegant, but the result is really consistent, and it still comes in at 8.5 seconds on 56K. The (minimally styled) latest version is here.