No Comment

When we started ArchivePress, it all looked so easy, not least because our experience and premise were based on an understanding of WordPress. What is particularly nice about WordPress is that in the RSS/Atom feeds, each Post declares the URI of a feed of its Comments (wfw:commentRss). Thus each of those feeds is added to the list of feeds to poll, and to harvest Posts and Comments you need only know the Blog Feed URI: the logic behind it is appealingly elegant.
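A minimal sketch of that discovery step, written in Python for illustration rather than taken from the plugin itself. The sample feed, the example.org URIs and the function name `comment_feeds` are all made up; the only thing borrowed from real feeds is the wfw:commentRss element and its namespace.

```python
import xml.etree.ElementTree as ET

# Namespace of the Well-Formed Web Comment API, which defines commentRss.
WFW_NS = "http://wellformedweb.org/CommentAPI/"

# Hypothetical feed: one post declares a comments feed, one does not.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:wfw="http://wellformedweb.org/CommentAPI/">
  <channel>
    <title>Example blog</title>
    <item>
      <title>First post</title>
      <link>http://example.org/?p=1</link>
      <wfw:commentRss>http://example.org/?feed=rss2&amp;p=1</wfw:commentRss>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.org/?p=2</link>
    </item>
  </channel>
</rss>"""


def comment_feeds(feed_xml):
    """Return the per-post comment feed URIs declared in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    uris = []
    for item in root.iter("item"):
        node = item.find("{%s}commentRss" % WFW_NS)
        # Posts without the element are simply skipped.
        if node is not None and node.text:
            uris.append(node.text.strip())
    return uris


# The blog feed URI is the only seed; comment feeds are discovered from it.
poll_list = ["http://example.org/?feed=rss2"] + comment_feeds(SAMPLE_FEED)
```

The point of the sketch is that the list of feeds to poll grows from a single seed URI, with no per-blog configuration.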

Not so, alas, with all leading brands of blog. Blogger only offers a feed of Comments over the whole blog (e.g. http://myblog.blogspot.com/feeds/comments/default). This means that we need to seed the harvester with two Feed URIs (Posts and Comments), and pair the comments up with their posts as they arrive. Similarly, with Typepad the Comments feed covers the whole blog (as described by Andy Powell) and is apparently not even configured by default.
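The pairing step could look something like the sketch below. It assumes each comment entry names its parent post through the Atom threading extension's thr:in-reply-to element (RFC 4685); whether a given platform's comments feed actually carries that element would need checking. The sample feed, the identifiers and the function name `pair_comments` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ATOM = "{http://www.w3.org/2005/Atom}"
THR = "{http://purl.org/syndication/thread/1.0}"  # Atom threading extension

# Hypothetical blog-wide comments feed, with comments from several posts mixed.
SAMPLE_COMMENTS = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:thr="http://purl.org/syndication/thread/1.0">
  <entry>
    <id>comment-1</id>
    <thr:in-reply-to ref="post-A" href="http://myblog.example/a.html"/>
  </entry>
  <entry>
    <id>comment-2</id>
    <thr:in-reply-to ref="post-B" href="http://myblog.example/b.html"/>
  </entry>
  <entry>
    <id>comment-3</id>
    <thr:in-reply-to ref="post-A" href="http://myblog.example/a.html"/>
  </entry>
  <entry>
    <id>comment-4</id>
  </entry>
</feed>"""


def pair_comments(comments_xml):
    """Group comment ids by parent post id; also return unpaired comment ids."""
    root = ET.fromstring(comments_xml)
    by_post = defaultdict(list)
    unmatched = []
    for entry in root.iter(ATOM + "entry"):
        comment_id = entry.findtext(ATOM + "id")
        parent = entry.find(THR + "in-reply-to")
        if parent is not None and parent.get("ref"):
            by_post[parent.get("ref")].append(comment_id)
        else:
            # No parent declared: keep the comment aside rather than guess.
            unmatched.append(comment_id)
    return dict(by_post), unmatched


paired, orphans = pair_comments(SAMPLE_COMMENTS)
```

Comments that declare no parent are kept in a separate list, so an archive could still store them and attempt a match later by some other means.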

This is why it’s been necessary to redesign the plugin approach to capturing comments. It has also led us to a detailed analysis of the metadata available in feeds, which will be preserved as posts are gathered.

Making It Easy

It was brought home to me when I was trying to set up the AP2 demonstrator that the interface inherited from FeedWordPress is much too complicated for our purposes. The aim of ArchivePress is to provide a simple way of archiving blogs and comments: FeedWordPress provides a number of options which I don’t think need concern someone trying to manage a blog archive. They even scare me, since I found that I couldn’t repeat some of the functionality I’d achieved with AP1. Ideally our ArchivePress control panel would be not much more complicated than this:

Enter URL of blog feed (mandatory)
Enter URL of blog comments feed (optional)
Import comments? Y/N
Import embedded objects? Y/N
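As a rough illustration of how small the settings behind such a panel could be, here is a sketch in Python. The class name, field names and default values are assumptions made for the example, not anything ArchivePress or FeedWordPress defines.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


def _is_http_url(value):
    """True if value looks like an absolute http(s) URL."""
    parts = urlparse(value or "")
    return parts.scheme in ("http", "https") and bool(parts.netloc)


@dataclass
class ArchiveSettings:
    # The four fields mirror the four control-panel entries.
    blog_feed_url: str                        # mandatory
    comments_feed_url: Optional[str] = None   # optional
    import_comments: bool = True              # default is an assumption
    import_embedded_objects: bool = False     # default is an assumption

    def validate(self):
        """Return a list of problems; an empty list means the settings are usable."""
        errors = []
        if not _is_http_url(self.blog_feed_url):
            errors.append("blog feed URL is mandatory and must be an http(s) URL")
        if self.comments_feed_url and not _is_http_url(self.comments_feed_url):
            errors.append("comments feed URL, if given, must be an http(s) URL")
        return errors


settings = ArchiveSettings(
    blog_feed_url="http://myblog.blogspot.com/feeds/posts/default",
    comments_feed_url="http://myblog.blogspot.com/feeds/comments/default",
)
problems = settings.validate()
```

Everything else would be fixed in code to a sensible default, so the person running the archive never sees it.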

Some of the other options offered by FeedWordPress we will simply ignore, or pick the default most appropriate for our needs: the archiver needn’t be troubled by them.

I’m away for a few days, but when I get back I will work with Rory to bring this to fruition, hopefully in time for us to show it off with our poster at iDCC in December.
