Title of Primary Project Output
ArchivePress: Archiving blogs with blogs
Screenshots and description of prototype
Figure 1: The Home Page of an ArchivePress installation using the default theme created by the project (APrints)

Figure 2: An example of one of the Browse Views (“Browse By Tag”) of the ArchivePress repository, using the default APrints theme

Figure 3: The “Add Feed” form in the plugin Control Panel within the WordPress Admin Dashboard

Figure 4: The “Edit Post Feeds” form for managing archived feeds, part of the plugin Control Panel

End User of Prototype
Anyone wanting to create a functionally rich archive of blog posts across multiple blog authors and multiple hosts, that updates automatically as new posts and comments are added to target blogs, should consider using WordPress with ArchivePress. (N.b. however, for retrospective archiving of “closed”, completed, single blogs, you are probably better off with HTTrack or Wget.)
Link to working prototype
Working prototype not currently available outside ULCC intranet.
Link to end user documentation
http://code.google.com/p/archivepress/wiki/GettingStarted
http://code.google.com/p/archivepress/wiki/FAQ
Link to code repository or API
http://code.google.com/p/archivepress/
Link to technical documentation
http://code.google.com/p/archivepress/wiki/InstallingArchivePress
Date prototype was launched
Public prototype available to download since March 2010. Demonstration site publicly available March – June 2010.
Project Team Names, Emails and Organisations:
Richard Davis, r.davis@ulcc.ac.uk, ULCC
Emanuele Fortunati, ULCC
Maureen Pennock, BL
Project Website
http://archivepress.ulcc.ac.uk
PIMS entry
https://pims.jisc.ac.uk/projects/view/1354
Table of Contents for Project Posts
Blog Table of Contents
The ArchivePress plugin is now fully compatible with WordPress 3 (thanks Emanuele!). To date, our test installations have been using WP2.9.x, but the aim of AP is to work with the current version of WordPress, so that site managers have access to as many current plugins as possible that are in ongoing development.
One interesting recent plugin discovery, addressing current Semantic Web themes, is the PoolParty SKOS plugin. This would seem to have potential to semantically enhance an ArchivePress blog archive. In combination with, say, Triplify, it could create an extremely data-rich blog archive. We’re not going to be able to do that during this project, but hope to try it out at some point. I’d be very interested to hear about and help with any other experiments with this.
You can download the latest ArchivePress package for WP3 (tagged RC2) from Google Code.
With lots of other commitments pressing, it’s been another quiet time on the ArchivePress project, but now we are ready to enter a final phase of research and reporting into the effects of our activities.
While we’ve been away we have been running a demo instance of AP on a public web hosting service with encouraging results, which I’ll report on soon. It was always intended that the plugins should run successfully on a standard public installation of WordPress. Unfortunately the demo instance has had to be removed at short notice because of the one significant and unavoidable downside of unattended web harvesting: the exponential growth of content, and of scheduled tasks, as each blog post harvested by the repository also adds another comments feed that needs checking. Budget shared webspace hosts tend to get a bit twitchy if your PHP hogs the processor.
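To get a feel for how quickly the number of feeds to check builds up, here is a rough back-of-the-envelope sketch in Python. It is not the plugin's scheduler: the function name and the figures (10 blogs, 3 new posts per blog per week) are made up for illustration, and it simply assumes one post feed per blog plus one extra comments feed for every post harvested.

```python
def feeds_to_check(blogs, posts_per_blog_per_week, weeks):
    """Return the total number of feeds to poll at the end of each week.

    Assumption (for illustration only): one post feed per blog, plus one
    comments feed for every post harvested so far.
    """
    totals = []
    for week in range(1, weeks + 1):
        comment_feeds = blogs * posts_per_blog_per_week * week
        totals.append(blogs + comment_feeds)
    return totals


# Hypothetical archive: 10 blogs, each publishing 3 posts a week.
print(feeds_to_check(blogs=10, posts_per_blog_per_week=3, weeks=4))
```

Every one of those feeds is another scheduled check, so the load on the host keeps climbing for as long as the target blogs stay active.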
Fortunately Emanuele included with the plugin the option to use the server’s native crontab feature, rather than the PHP pseudo-cron which is built in to WordPress. We now have a final demonstrator server ready where we can better control and monitor these activities. We will be reporting back on that in September and October.
We’ve had several enquiries about using ArchivePress. In most cases, however, these have related to retrospective blog archiving requirements, which are best dealt with by standard web harvesting approaches. ArchivePress has some capabilities to play “catch up”, but its main value is as a tool for dynamically harvesting active blog content. If we have a valuable active blog with legacy content, then using the ArchivePress import feature is worthwhile, but for a purely retrospective capture, I’m not sure it is worth it, and I’d still recommend capturing the content with httrack or wget.
As well as monitoring the final pilot system, we also hope to look at some other issues that the project has raised about blog archiving, and how we can address them within ArchivePress. These include:
- Use cases: what have we learned about blog archiving use cases and attitudes to blog preservation during the last year or so? Have expectations of blog archiving changed since the project’s inception?
- Embedding semantic metadata: the ArchivePress templates offer many possibilities for enriching and normalising blog metadata
- Persistent identifiers: what is the value and what are the possibilities of implementing persistent identifiers for posts and comments in an AP archive?
- SPARQL endpoints: can we usefully add a SPARQL interface to an AP archive?
- Cloud-hosting: as other repository applications move into the cloud, is this a direction that AP could support?
All this and more coming soon! (Possibly a new theme too, I never liked this one!)
The ArchivePress plugin is now a reality: we have just released a new Version that is working quite well.
It is still a Beta Version, but there are really only a few minor bugs, and the Admin Area has been completely redesigned and developed using the “WordPress style”: in this way ArchivePress is now really intuitive and easy to learn.
If you want to know more about the ArchivePress core, please read this post or the Project Documentation, because today I would like to properly introduce the ArchivePress Admin Area:

This is the ArchivePress main menu: it gives you access to the plugin Admin Pages.

We have also added some shortcuts to make things fast!

As you can see, there are TWO dashboard Widgets: the first one is a Summary of the plugin status; the second one shows the Plugin Logs.

This is an example of an “Add/Edit” page. As you can see, to make things faster and easier you only have to fill in 2-3 fields and the plugin will fill in the others for you. If you would like to change some settings, you can simply open the “Advanced Settings” area and tune your Feed as you prefer!

And finally, an example of a page that shows Post Feeds (or Comment Feeds, Authors or Logs, …). These pages have been developed to look like part of the WordPress Core: apart from the graphics, you will find:
- The Page Info, which contains useful information about the current Page
- The filters and the Search fields, which allow you to find the content you are looking for
- The Bulk Actions and the Line Actions, which are among the most powerful features of the WordPress Admin Area
Still reading??? Go and Download the new ArchivePress Plugin Version and let us know how it is working!
Day after day ArchivePress is becoming a reality, and today I’m proud to announce the final Alpha Release. This version is currently in testing, but it seems to be working fine and it’s properly “grabbing” posts and comments from WordPress and Blogger blogs. You can download it as a zip file from Google Code.
The plugin is still missing a proper admin interface, but the most important pieces have already been developed and it won’t take us too much time to design a proper user area. Advanced configuration features, such as frequency of feed-polling, can currently be set in the plugin’s configuration; the aim is still to keep the main interface as uncluttered as possible.
Another thing I feel very proud of is that the plugin is fast: the code has been written with performance in mind, so that the plugin can run without needing a “special” server; in this way almost everybody can enjoy ArchivePress without any trouble concerning CPU, RAM and bandwidth. I think (this is just a PERSONAL THOUGHT!) ArchivePress is the fastest, easiest and most powerful WordPress plugin of its type.
In addition, Richard and I have developed a theme to highlight the key features of the plugin. Stay tuned here, because new interesting posts are coming: I’ll explain how the plugin works and what it is possible to do with it.
Next stop will be the Beta version, with an improved admin Control Panel!
Hi there, this is Emanuele and this is my first post, but this is not to introduce myself but rather to introduce the new ArchivePress plugin for WordPress.
I started 10 days ago by having a look at the FeedWordPress plugin, used in the past as a base/reference for ArchivePress, and I noticed it is a really good plugin, but really too complex (and messy) for “normal” people (as Richard had also told me!): basically we want a “feed aggregator” plugin based on the same functionality as FeedWordPress, but easier to use.
The idea we had is to offer a Really Simple Interface (inside the WordPress admin area) to allow users to add, edit, reset, remove and schedule harvesting from RSS and Atom feeds, leaving to the plugin all the other tasks, like managing options, post updates, scheduling, categories, tags and authors.
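To give an idea of what “harvesting from a feed” means in practice, here is a minimal sketch in Python using only the standard library. It is not the plugin's code (the plugin is written in PHP for WordPress): the sample feed, the example.org links and the function name `parse_rss_items` are all invented for illustration, and it only handles a simple RSS 2.0 document, not Atom.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 feed standing in for a target blog.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>First post</title>
      <link>http://example.org/first-post</link>
      <category>archiving</category>
      <category>blogs</category>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.org/second-post</link>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Extract title, link and tags from each <item> of an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # RSS <category> elements are the natural source of tags.
            "tags": [c.text for c in item.findall("category") if c.text],
        })
    return items


for post in parse_rss_items(SAMPLE_RSS):
    print(post["title"], post["link"], post["tags"])
```

A real harvester would then store each item as a post, along with its author, categories and tags, and would remember which items it has already seen.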
But there are limits to what we can automate, and a more experienced user will need more options to manage and properly configure the blog and each feed in a different way.
So, why not offer an advanced interface to these “expert users”? Simply, we will! And we won’t offer just advanced feed management procedures (add, edit, remove, schedule feeds), but also the possibility of manually modifying a configuration file to use custom settings during the plugin activation: in this way users can decide whether to use advanced procedures every time they add a feed, or properly set the configuration file and then use these default settings for each feed.
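The “defaults plus per-feed overrides” idea can be sketched like this in Python. Again, this is only an illustration of the approach, not the plugin's configuration file: the setting names (`poll_interval_minutes`, `import_comments`, `import_tags`) and the function `resolve_feed_settings` are hypothetical.

```python
# Hypothetical site-wide defaults, as they might be set once in a config file.
DEFAULT_FEED_SETTINGS = {
    "poll_interval_minutes": 60,
    "import_comments": True,
    "import_tags": True,
}


def resolve_feed_settings(defaults, overrides=None):
    """Return the settings for one feed: defaults, with any overrides applied."""
    settings = dict(defaults)          # copy, so the defaults stay untouched
    settings.update(overrides or {})   # an expert user's per-feed choices win
    return settings


# A "normal" user adds a feed and just takes the defaults...
print(resolve_feed_settings(DEFAULT_FEED_SETTINGS))

# ...while an "expert" user tunes a single busy feed.
print(resolve_feed_settings(DEFAULT_FEED_SETTINGS, {"poll_interval_minutes": 15}))
```

The point of the design is that the simple interface never has to ask for these values, while the advanced interface can change any of them for one feed without touching the others.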
Not everything is done, but I can say an alpha version is almost ready!