The History of the IIPC, through Web Archives

By Nicholas Taylor, Web Archiving Service Manager, Stanford University

Web archives have now been around long enough that some of today’s full-grown adults never experienced the preserved web content firsthand; to this cohort, some websites were only ever “historical.” Web archives represent an increasingly vital and singular body of cultural heritage, and a tool for understanding both the past and social phenomena. They’re also a handy tool for understanding the evolution of the IIPC itself.

[Image: home page of the IIPC website, 16 March 2015]

While I trust that our own programmatic record-keeping would be sufficient to reconstruct some of the following findings, they would also be thankfully self-evident to a future historian (one unusually interested in the history of the history of the Web) from the web archives themselves. Consulting the UK Web Archive front-end for the IIPC-funded, LANL-developed and -hosted Memento Aggregator shows that the Internet Archive has the greatest number of snapshots covering the entire history of the IIPC’s web presence.

Here’s some of what I learned, exploring the timeline:

[Image: home page of the IIPC website, 3 June 2004]

I imagine that these latter three points especially will be interesting to consider in the context of our forthcoming discussions for a new membership agreement to replace the one expiring this year (PDF) and to inform refined IIPC mission and goals. Here’s hoping that the most exciting history of the history of the Web is still ahead of us!

What’s Next for OpenWayback

By Kristinn Sigurðsson, Head of IT at the National and University Library of Iceland. Cross-posted from his own blog.

About one month ago, OpenWayback 2.1.0 was released. This was mostly a bug-fix release with a few new features merged in from Internet Archive’s Wayback development fork. For the most part, the OpenWayback effort has focused on ‘fixing’ things: making sure everything builds and runs nicely and is better documented.

I think we’ve made some very positive strides.

Work is now ongoing for version 2.2.0. Finally, we are moving towards implementing new things! 2.2.0 still has some fixing to do. For example, localization support needs to be improved. But we’re also planning to implement something new: support for internationalized domain names.
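For readers unfamiliar with internationalized domain names, here is a rough illustration of the underlying conversion. This is not OpenWayback code (OpenWayback is written in Java) and it says nothing about how 2.2.0 will implement the feature. It is a minimal Python sketch, using only the standard library’s built-in "idna" codec, of how a Unicode host name maps to the ASCII form that DNS uses and back again. The function names and the sample host are made up for illustration.

```python
def to_ascii_host(host: str) -> str:
    """Convert a Unicode host name to its ASCII (Punycode) form.

    Uses Python's built-in "idna" codec. The function name is
    hypothetical and chosen only for this illustration.
    """
    return host.encode("idna").decode("ascii")


def to_unicode_host(host: str) -> str:
    """Convert an ASCII (Punycode) host name back to Unicode."""
    return host.encode("ascii").decode("idna")


if __name__ == "__main__":
    # "bücher.example" is an illustrative host name, not a real archive target.
    ascii_form = to_ascii_host("bücher.example")
    print(ascii_form)
    print(to_unicode_host(ascii_form))
```

A replay tool that receives a Unicode host name in a request would presumably need to apply some such normalization so that both spellings resolve to the same archived captures.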

We’ve tentatively scheduled the 2.2.0 release for “spring/early summer”.

After 2.2.0 is released, the question will be which features or improvements to focus on next. The OpenWayback issue tracker on GitHub has (at the time of writing) about 60 open issues in the backlog (i.e. not assigned to a specific release).

We’re currently in the process of trying to prioritize these. Our current resources are nowhere near sufficient to resolve them all. Prioritization will involve several aspects, including how difficult the issues are to implement, how popular they are and, not least, how clearly they are defined.

This is where you, dear reader, can help us out by reviewing the backlog and commenting on issues you believe to be relevant to your organization. We also invite you to submit new issues if needed.

It is enough to just leave a comment that this is relevant to your organization. Even better would be to explain why it is relevant (this helps frame the solution). Where appropriate, we would also welcome suggestions for how to implement the feature, notably in issues like the one about surfacing metadata in the interface.

If you really want to see a feature happen, the best way to make it happen is, of course, to pitch in.

Some of the features and improvements we are currently reviewing are:

  • Enable users to ‘diff’ different captures of an HTML page. Issue 15.
  • Enable search results with a very large number of hits. Issue 19.
  • Surface more metadata. Issues 28 and 29.
  • Enable time-ranged exclusions. Issue 212.
  • Create a revisit test dataset. Issue 117.
  • Use CDX indexing as the default instead of the BDB index. Issue 132.
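To make the first item in the list above a little more concrete, here is a minimal sketch of what ‘diffing’ two captures could mean at its simplest. It is an assumption-laden illustration, not a proposal for how issue 15 should be solved: it uses Python’s standard difflib module rather than anything in OpenWayback, compares raw HTML line by line, and the function name and sample markup are invented for the example.

```python
import difflib


def diff_captures(older_html: str, newer_html: str) -> str:
    """Return a unified diff between two captures of the same page.

    This is a plain line-based text comparison; it does not parse the
    HTML or account for replay rewriting. The labels are placeholders.
    """
    diff_lines = difflib.unified_diff(
        older_html.splitlines(),
        newer_html.splitlines(),
        fromfile="older capture",
        tofile="newer capture",
        lineterm="",
    )
    return "\n".join(diff_lines)


if __name__ == "__main__":
    # Two made-up captures of the same hypothetical page.
    older = "<html>\n<h1>Welcome</h1>\n<p>Members: 12</p>\n</html>"
    newer = "<html>\n<h1>Welcome</h1>\n<p>Members: 50</p>\n</html>"
    print(diff_captures(older, newer))
```

A real feature would likely need to go further, for example by ignoring markup injected at replay time or by comparing rendered text rather than source, which is the kind of detail that comments on the issue can help pin down.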

As I said, these are just the ones currently being considered. We’re happy to look at others if there is someone championing them.

If you’d like to join the conversation, go to the OpenWayback issue tracker on GitHub and review issues without a milestone.

If you’d like to submit a new issue, please read the instructions on the wiki. The main thing to remember is to provide ample details.

We only have so many resources available. Your input is important to help us allocate them most effectively.