IIPC Steering Committee Election 2019

The nomination process for the IIPC Steering Committee is now open.

The Steering Committee is the executive body of the IIPC. It currently comprises 15 member organisations, which take a leadership role in high-level strategic planning, the development and management of programs, policy creation, overall administration, and contributions to IIPC Portfolios and other activities.

What is at stake?

Serving on the Steering Committee is an opportunity for motivated members to help guide the IIPC’s mission of improving the tools, standards and best practices of web archiving while promoting international collaboration and the broad access and use of web archives for research and cultural heritage. Steering Committee members are expected to take an active role in leadership, contribute to SC and Portfolio activities, and help guide and administer the organisation.

Who can run for election?

Serving on the Steering Committee is open to any current IIPC member, and we strongly encourage any organisation interested in serving to nominate itself for election. SC members are elected for 3 years and meet twice a year in person (once during the General Assembly and once in September), plus two or more additional times by teleconference.

Please note that the nomination should be on behalf of an organisation, not an individual. Once elected, the member organisation designates a representative to serve on the Steering Committee. The list of current SC member organisations is available on the IIPC website.

How to run for election?

All nominee institutions, both new candidates and existing members whose term is expiring but who are interested in continuing to serve, are asked to write a short statement (max 200 words) outlining their vision for how they would contribute to the IIPC by serving on the Steering Committee. Statements can point to past contributions to the IIPC or the SC, relevant experience or expertise, new ideas for advancing the organisation, or any other relevant information.

All statements will be posted online and emailed to members prior to the election with ample time for review by all membership. The results will be announced in mid-May and the three-year term on the Steering Committee will start on 1 June.

Below you will find the election calendar. We are very much looking forward to receiving your nominations. If you have any questions, please contact the IIPC PCO.



Election Calendar

  • 12 November to 1 March: Members are invited to nominate themselves by sending an email including a statement to the IIPC Programme and Communications Officer.
  • 1 April: Nominees’ statements are published on the Netpreserve Blog and Members mailing list. Nominees are encouraged to campaign through their own networks.
  • 1 April to 30 April: Members are invited to vote online. An online voting tool will be used to conduct the vote. The PCO will monitor the vote, ensuring that each organisation votes only once for all nominated seats and that the vote is cast by the organisation’s official representative. Members will be encouraged to cast their vote before, during, and after the GA.
  • 30 April: Voting ends.
  • 1 May: The results of the vote are announced officially on the Netpreserve blog and Members mailing list.
  • 1 June: End/start of SC members’ terms. The newly elected SC members start their term on 1 June and are invited to attend a first meeting (by teleconference) by the end of June. The next face-to-face SC meeting will take place in Zagreb in June 2019.
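
The two checks described for the voting period (one ballot per organisation, cast by that organisation’s official representative) can be illustrated with a short sketch. This is not the online voting tool mentioned above. The function name, the data layout and the example organisations are all hypothetical, and the sketch assumes that the first valid ballot from an organisation is the one that counts.

```python
def tally_votes(ballots, representatives):
    """Count ballots, enforcing one vote per organisation.

    ballots: list of (organisation, voter, chosen_nominees), in order received.
    representatives: dict mapping organisation -> official representative.
    Both structures are hypothetical and exist only for this illustration.
    """
    counted = set()   # organisations whose ballot has been accepted
    totals = {}       # nominee -> number of votes received
    rejected = []     # (organisation, reason) for each ballot not counted

    for organisation, voter, choices in ballots:
        # The vote must be cast by the organisation's official representative.
        if representatives.get(organisation) != voter:
            rejected.append((organisation, "not the official representative"))
            continue
        # Each organisation votes only once for all nominated seats.
        if organisation in counted:
            rejected.append((organisation, "organisation has already voted"))
            continue
        counted.add(organisation)
        # A nominee listed twice on one ballot still gets a single vote.
        for nominee in set(choices):
            totals[nominee] = totals.get(nominee, 0) + 1

    return totals, rejected


representatives = {"Library A": "alice", "Archive B": "bob"}
ballots = [
    ("Library A", "alice", ["Org X", "Org Y"]),
    ("Archive B", "mallory", ["Org X"]),   # not the official representative
    ("Library A", "alice", ["Org Z"]),     # second ballot from the same organisation
    ("Archive B", "bob", ["Org Y"]),
]
totals, rejected = tally_votes(ballots, representatives)
print(totals)
print(rejected)
```

In this example only two ballots are counted: the ballot from an unofficial voter and the repeat ballot from Library A are both set aside with a reason, so the PCO could follow up with the organisations concerned.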

 


Online Hours: Supporting Open Source

By Andrew Jackson, Web Archiving Technical Lead at the British Library

At the UK Web Archive, we believe in working in the open, and that organisations like ours can achieve more by working together and pooling our knowledge through shared practices and open source tools. However, we’ve come to realise that simply working in the open is not enough – it’s relatively easy to share the technical details, but less clear how to build real collaborations (particularly when not everyone is able to release their work as open source).

To help us work together (and maintain some momentum in the long gaps between conferences or workshops), we were keen to try something new, and hit upon the idea of Online Hours. It’s simply a regular web conference slot (organised and hosted by the IIPC, but open to all) which can act as a forum for anyone interested in collaborating on open source tools for web archiving. We’ve been running for a while now, and have settled on a rough agenda:

Full-text indexing:
– Mostly focussing on our Web Archive Discovery toolkit so far.

Heritrix3:
– including Heritrix3 release management, and the migration of Heritrix3 documentation to the GitHub wiki.

Playback:
– covering e.g. SolrWayback as well as OpenWayback and pywb.

AOB/SOS:
– for Any Other Business, and for anyone to ask for help if they need it.

This gives the meetings some structure, but is really just a starting point. If you look at the notes from the meetings, you’ll see we’ve talked about a wide range of technical topics, e.g.

  • OutbackCDX features and documentation, including its API;
  • web archive analysis, e.g. via the Archives Unleashed Toolkit;
  • summary of technologies so we can compare how we do things in our organisations, to find out which tools and approaches are shared and so might benefit from more collaboration;
  • coming up with ideas for possible new tools that meet a shared need in a modular, reusable way, and identifying potential collaborative projects.

The meeting is weekly, but we’ve attempted to make the meetings inclusive by alternating the specific time between 10am and 4pm (GMT). This doesn’t catch everyone who might like to attend, but at the moment I’m personally not able to run the call at a time that might tempt those of you on Pacific Standard Time. Of course, I’m more than happy to pass the baton if anyone else wants to run one or more calls at a more suitable time.

If you can’t make the calls, please consider:

My thanks go to everyone who has come along to the calls so far, and to IIPC for supporting us while still keeping it open to non-members.

Maybe see you online?

Web Archivists, Assemble!

By Alex Thurman, Columbia University Libraries, Member of the IIPC Steering Committee and the WAC Program Committee (2016-2018), Co-Chair of the Content Development Group

The IIPC General Assembly & Web Archiving Conference is the professional gathering I anticipate most eagerly each year. In an energizing atmosphere of international cooperation, web curators, librarians, archivists, tool developers, computer scientists, and academic researchers from member organizations and beyond meet to share experiences and best practices and plan projects to tackle the collective challenge of preserving web resources.

I’ve had the good fortune of attending each year since 2012, and for the past three years I’ve also had the rewarding experience of serving on the program committees planning these events. As we look forward to the exciting upcoming 2018 conference in Wellington, New Zealand, here is some background on the recent evolution of the GA/WAC and the work of the 2018 WAC Program Committee.

Recent background

2018 marks the fifteenth anniversary of the IIPC, and the twelfth consecutive year that members of the IIPC will come together in an annual General Assembly. The IIPC Steering Committee has striven to cycle (loosely, as dependent on members volunteering to host the event) the venue of the GA/WAC in alternate years between Europe, North America and Australasia. And from the start, the GA event programs have combined days reserved for IIPC members (focused on Consortium planning and working group activities) with one or more open days to welcome the perspectives and expertise of the wider web archiving community and of researchers.

To emphasize this aspect of outreach to researchers and promoting awareness of web archiving, the Steering Committee has in recent years opted to formalize the “open days” as a distinct event—the IIPC Web Archiving Conference. The 2016 event was the first to thus distinguish the General Assembly from the Web Archiving Conference, and thereafter, at the suggestion of that PC’s Chair (Kristinn Sigurðsson, National and University Library of Iceland), planning responsibility for the different event components became more distributed: the GA program would be determined by the Steering Committee Officers and Portfolio Leads and the Working Group Chairs; a mostly local Organizing Committee would see to the logistical planning of securing a venue and catering and possible sponsors; and the Web Archiving Conference program would be developed by a Program Committee. The 2017 Program Committee (chaired by Nicholas Taylor, Stanford University) was the first to include some non-IIPC members, and their CFP was the first to attract more relevant submissions than we had space to accept, a milestone in the maturation of the conference.

Work of the 2018 Program Committee

Co-chaired by Jan Hutař (Archives New Zealand) and Paul Koerbin (National Library of Australia), the 12-member 2018 Program Committee started work in November 2017. Our first task was drafting a call for papers, which involved first discussing whether the conference would have a stated theme and the types (presentations, panels, workshops, tutorials) of submission proposals we’d ask for and the nature of the submission (abstracts? full papers?). We needed a flexible theme that would acknowledge the IIPC’s milestone 15th anniversary and the value of our collective work preserving the web so far, while embracing creative new approaches to the evolving challenges we face. In his draft CFP, Paul Koerbin hit on “Web Archiving Histories and Futures,” and we ran with that. And as the Wellington event will be the first GA/WAC held in Australasia in 10 years, we especially encouraged submissions related to Asia/Pacific web archiving activities.

To encourage submissions from all types of web archiving practitioners and users, in the CFP we further listed some suggested topics, under the rubrics of “building web archives,” “maintaining web archive content and operations,” “using and researching web archives,” and “web archive histories and futures.” And we opted to ask applicants to submit abstracts only rather than full papers, both to lower the barriers to application in order to get more submissions, and to allow all Program Committee members to consider (and vote on) all submissions, rather than assigning reviewers to specific papers. Once the CFP was ready, PC members worked hard to distribute it to a wide selection of mailing lists, reaching beyond IIPC members and other cultural memory institutions to also get submissions from independent researchers.

This strategy worked (boosted no doubt by the intrinsic appeal of visiting Wellington!), as we received a record number of submissions for the WAC, submitted through EasyChair. The breadth and depth of interesting submissions allowed us to build a strong program–while unfortunately having to reject some relevant proposals. Each committee member read all the submitted abstracts and rated each one on a 3-point scale, yielding cumulative point averages for each submission from which the committee could decide which submissions would be accepted for the conference. In order to know how many submissions could be accepted we first had to consider how much conference schedule time we had available, which would depend in part on whether we would have multiple tracks.
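
The scoring step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the committee’s final decisions also involved discussion and the available schedule time, and the function name, the strict top-N cutoff, the tie-breaking by identifier and the sample ratings are all hypothetical.

```python
from statistics import mean


def rank_submissions(ratings, capacity):
    """Average each submission's ratings and keep the top `capacity` entries.

    ratings: dict mapping a submission id to the list of 1-3 scores it received
             (one score per committee member). Hypothetical data layout.
    capacity: number of slots available in the conference schedule.
    """
    averages = {sid: mean(scores) for sid, scores in ratings.items()}
    # Highest average first; ties are broken alphabetically by id (an assumption).
    ranked = sorted(averages, key=lambda sid: (-averages[sid], sid))
    return ranked[:capacity], averages


ratings = {
    "talk-a": [3, 3, 2],
    "talk-b": [1, 2, 2],
    "panel-c": [3, 2, 2],
    "workshop-d": [2, 1, 1],
}
accepted, averages = rank_submissions(ratings, capacity=2)
print(accepted)  # ['talk-a', 'panel-c']
```

The `capacity` argument reflects the point made above: the number of acceptances could only be fixed once the schedule, including the number of parallel tracks, was known.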

We decided the program would have a mix of plenary talks and usually two tracks of presentations or workshops, and Olga Holownia (IIPC Program & Communication Officer) provided a range of detailed schedule templates for us to use to figure out how many individual presentations, panels, and workshops we’d have room for. We then began grouping accepted proposals into thematic sessions, loosely conceived as more-technical and less-technical tracks, in order to reduce (though not eliminate) the frustration of attendees wishing they could be in both tracks at once. Committee members then divided up the responsibility of serving as session chairs, to introduce the speakers and keep the sessions running on time.

Between the tasks of preparing the CFP and evaluating the submissions and shaping them into a program, the committee had the additional enjoyable responsibility of brainstorming possible keynote speaker candidates. Committee members suggested over two dozen possible keynoters, voted on them, and eventually submitted a few outstanding candidates to the Organizing Committee for their consideration. The Organizing Committee took these suggestions and added others based on their familiarity with the Australasian digital library and academic scene and delivered two exciting keynote speakers – Wendy Seltzer (World Wide Web Consortium) and Rachael Ka’ai-Mahuta (Te Ipukarea, the National Maori Language Institute, Auckland Institute of Technology) – and an additional plenary talk from Vint Cerf (Google). With these and many other talented contributors from within and beyond IIPC member institutions, the 2018 IIPC Web Archiving Conference looks to be a rich and stimulating event.

Register now!

Serving on the WAC Program Committee is a great opportunity to work directly with IIPC colleagues and other web archiving enthusiasts. And the work continues – you can volunteer now to serve on the Program Committee and start shaping the 2019 IIPC WAC.

A personal reflection on the IIPC WAC

By Gillian Lee, Coordinator, Web Archives at the National Library of New Zealand, Member of the IIPC Steering Committee and the WAC Program Committee

This year I’ve had the privilege of being part of the programme committee for IIPC WAC. Reading through the abstracts that many of you sent in gave me a real sense of excitement about the work that we are all involved in. That caused me to reflect on the benefits of the IIPC conference and what it means to us as members. Some of you might attend these conferences on a regular basis, others may never have had that opportunity.

I’ve been web archiving for 11 years and have been fortunate to attend 3 IIPC conferences during that time. It’s rare for me to attend a conference that’s actually about the work I do, so I really value those times! It’s an opportunity to finally meet people who were formerly just names on mailing lists and blog posts. Getting together with other web archivists is invaluable, whether it’s talking to someone who is just starting out in the web archiving world, sharing the struggles of budget constraints, or learning more about what members are doing. You can’t beat that!

Even in this digital age it’s easy to feel isolated here in New Zealand when we hear so much about web archiving developments, especially in Europe and the States. There’s only so much you can learn from emails, blog posts and the odd webinar that’s not scheduled for 2am NZ time!!

Despite the distance we have collaborated with other IIPC members over the years. Back in 2006 the National Library of New Zealand worked with the British Library to build Web Curator Tool (WCT). The BL have moved on and developed other tools since then, and this year we’ve collaborated with National Library of the Netherlands in a major upgrade to WCT. Kees Teszelszky blogged about this recently. You can find out more about it during the IIPC conference in Wellington in November.

We’ve also been involved with the Content Development Working Group by submitting seed lists to collaborative collections such as the Olympic Games, World War One Commemoration and the News around the World project. If you’re new to IIPC, do consider getting involved in one of the IIPC groups.

We’re really excited to be hosting IIPC this year and look forward to meeting you all in person! A number of my colleagues have never had the chance to attend an IIPC conference, so they’re in for a treat! See you soon!

National Library of New Zealand, Photo by Mark Beatty / CC BY-NC 3.0 NZ.

Welcome to WAC in Wellington

By Peter McKinney, Digital Preservation Policy Analyst at the National Library of New Zealand and the Chair of the IIPC 2018 General Assembly and Web Archiving Conference Organising Committee

National Library of New Zealand Te Puna Mātauranga o Aotearoa.

I remember my first time in New Zealand. It was wonderful. But I do remember commenting to my partner, as we sat on the tarmac in Auckland, that I couldn’t live here as it was too far away from anything (I lived in Scotland at the time).

Just over a year later I moved to Wellington.

I’m not sure whether this shows my unerring ability to change my mind at a whim, or the strength of what I found over here. I hope the latter. The travel for visitors is well worth it. Wellington and New Zealand are amazing. And while the work of the National Library has attracted a number of us to come and live our lives here, it is the country that makes it home.

It is therefore my great honour to be part of the team that is welcoming you here. The National Library of New Zealand feels greatly privileged to be hosting this year’s IIPC General Assembly and Web Archiving Conference. The Library has received great benefit from being a member of the IIPC over the years, and to be able to entice members and the wider web archiving community all the way down to the South Pacific is an amazing opportunity for us. We can open up participation to those who just have not been able to travel those distances up to the northern hemisphere. It is also a great chance for us to show off what we have down here.

I have two primary responsibilities in my role as Chair of the Organising Committee. The first is to ensure that IIPC members have a productive week. This means providing a comfortable environment where members can get their business done and enjoy everything Wellington has to offer. My second responsibility is to ensure that “locals” (New Zealanders and our Pacific neighbours) are able to take advantage of the experience and expertise that will be converging at the Library; this is a precious opportunity that will not come round again in the foreseeable future.

The website has a host of information about the GA and WAC, and I encourage you to check it out (and get in touch if you need more information). Alex Thurman has written about the work of the programme committee pulling together what is a brilliant selection of papers, panels, posters, tutorials and workshops. Gillian Lee has also covered what it means to staff in the National Library to be able to have the IIPC event down here in Wellington.

Personally, I can’t wait to hear from our keynote speakers (Rachael Ka’ai-Mahuta and Wendy Seltzer). They have been asked to challenge us and make us pause and consider what the future of web archiving may look like. Vint Cerf needs no introduction and we are incredibly grateful that he has accepted our invitation to share his current thinking with us. We’re also having a public event on Tuesday, which we will be announcing in the next few weeks.

The week will be busy and, hopefully, productive and inspiring. I also can’t encourage you enough to explore Wellington and beyond if you have time. There is, of course, plenty of time to sleep on the plane on the way back!

IIPC Content Development Group: What’s on in 2018

by Nicola Bingham, Lead Curator, Web Archiving British Library and IIPC CDG Co-Chair

The co-chairs of the IIPC Content Development Group (CDG) are pleased to submit the following update on the group’s activity so far this year and the major projects which will occupy the group going forward in 2018.

What do we do?

For those new to the IIPC or those who may be interested in either contributing to planned collections or thinking about submitting ideas for new ones, it is worth revisiting the CDG’s mandate.

The CDG was formed in 2014 and crawling began in early 2015. The Group is charged with building publicly accessible web collections on transnational themes or events. Collections are multinational, multilingual and cover a wide variety of perspectives. They are intended not only to be of particular value to researchers now and in the future, but also to promote awareness of web archiving globally, encouraging individuals and institutions that are not yet involved in web archiving, or that want to become involved, to find out more.

How to propose a collection?

New collections can be proposed on the CDG members’ mailing list, where the CDG co-chairs and the group (sometimes in consultation with researchers and others) develop a list of collections to pursue in line with pre-defined criteria in the collection policy and our capacity according to the budget approved by the Steering Committee. Each collection is supported by the co-chairs, who serve as project admins, while a lead curator (often, but not necessarily, the person who proposed the collection) scopes the collection, determines the metadata, monitors the collection and leads on quality assurance. Each collection is open to all members to contribute to. We strive to open up the nomination procedure as widely as possible, to non-members and members of the public, to elicit as wide a coverage of particular topics as possible.

Collections developed so far, via the IIPC Archive-It account, can be viewed at https://archive-it.org/home/IIPC

2018 collecting

So far in 2018 we have completed the 2018 Winter Olympics & Paralympics Collection, which contains nearly 1,500 seeds and 1.2 TB of data. The collection covered 35 countries in 21 languages. The nominations came from a mix of IIPC members and a public nomination form that was available through previous blog posts. For more information on this collection, see the blog posts by lead curator Helena Byrne.

In addition, we updated the National Olympic & Paralympic Committees collection with committees that were missing from the crawl in 2016. This collection was crawled again during the 2018 Winter Olympics & Paralympics. Not all National Committees have a website, but if you notice we are missing any websites get in touch (2018-winter-olympics [at] iipc.simplelists .com).

We are now turning our attention to resuming the World War I Commemoration and the ‘Online News around the World’ collections.

The World War I Commemoration project, led by Peter Stirling, BnF, started in October 2015. It already includes over 2,000 seeds and covers a wide variety of different websites, from official commemorations to amateur history websites and the reporting of the centenary in the media. Websites from several different countries and many languages have been selected by the members of the IIPC. 2018 is an important year for this collection as we will be looking to capture activity leading up to and during the centenary of the armistice in November.

The ‘Online News around the World’ collection, led by Sabine Schostag of the Royal Danish Library, has been several years in planning and will begin in earnest shortly. This ambitious project aims to document a selection of online news websites from as many countries as possible during one week of the year (likely to be in November 2018). Once the metadata has been finalised, we will post details of how to nominate content for this collection. The IIPC has members in over 34 countries around the world, which is already a good starting point, but we hope to canvass much more widely than this to achieve our goal of global coverage!

This summer we will also be running new crawls of the seeds in the International Cooperation Organizations collection, led by Alex Thurman from Columbia University Libraries, which consists of all known active websites in the .int top-level domain (available only to organizations created by treaties). This collection was started in 2016 and includes important agencies in areas that require international cooperation, like environmental protection, economic development, and telecommunication.

In the meantime, we hope to see as many CDG members as possible for our session at the IIPC General Assembly on 12th November – more details to follow shortly.

World Wide Webarchiving: Upgrading the Web Curator Tool

by Kees Teszelszky, Curator digital collections, National Library of the Netherlands

The Web Curator Tool (WCT) is a workflow management application designed for selective web archiving. It was created for use in libraries and other digital heritage collecting organisations, and supports collection by non-technical users while still allowing complete control of the web harvesting process. The WCT is a tool that supports the selection, harvesting and quality assessment of online material when employed by collaborating users in a library environment. The application is integrated with the existing Heritrix web crawler and supports key processes such as permissions, job scheduling, harvesting, quality review, and the collection of descriptive metadata. The WCT allows institutions to capture almost any online resource. These artefacts are handled with all possible care, so that their integrity and authenticity are preserved.

The WCT was developed in 2006 as a collaborative effort by the National Library of New Zealand (NLNZ) and the British Library (BL), initiated by the International Internet Preservation Consortium (IIPC), as can be read in the original documentation. The WCT is open-source and available under the terms of the Apache Public License. The project was moved in 2014 from SourceForge to GitHub. The latest ‘binary’ release of the WCT, v1.6.3, was published in July 2017 on the GitHub page of NLNZ. Even after 12 years, the WCT continues to be one of the most common open-source enterprise solutions for web archiving. It has an active user forum on GitHub and Slack.

From January 2018 onwards, NLNZ has been collaborating with the Koninklijke Bibliotheek – National Library of the Netherlands (KB-NL) to upgrade the WCT and add new features to make the application future-proof. This involves learning the lessons from the previous development and recognising the advancements and trends occurring in the web archiving community. The objective is to get the WCT to a platform where it can keep pace with the requirements of archiving the modern web. Further, the Permission Request module will be extended to fit the Dutch situation, which lacks a legal deposit for digital publications.

The first step in that process was decoupling the WCT from the old Heritrix 1.x web crawler, and allowing the WCT to harvest using the updated Heritrix 3.x version. A proof of concept for this change was successfully developed and deployed by the NLNZ, and has been the basis for a joint development work plan. The project will be extensively documented.

The NLNZ has been using the WCT for its selective web archiving programme since January 2007, and KB-NL since 2009. In 2008 NLNZ published an article describing their experience using WCT in a production environment. However, the software had fallen into a period of neglect, with mounting technical debt: most notably its tight integration with an outdated version of the Heritrix web crawler. While the last public release of the WCT is still used day-to-day in various institutions, this release has essentially reached its end-of-life as it has fallen further and further behind the requirements for harvesting the modern web. The community of users has echoed these sentiments over the last few years.

During 2016-2017 the NLNZ conducted a review of the WCT and how it fulfils business requirements, and compared the WCT to alternative software/services. The NLNZ concluded that the WCT was still the closest solution to meeting its requirements, provided the necessary upgrades could be done, namely a change to use the modern Heritrix 3 web crawler. Through a series of fortunate conversations the NLNZ discovered that another WCT user, KB-NL, was going through a similar review process and had reached the same conclusions. This led to collaborative development between the two institutions to uplift the WCT technically and functionally to be a fit-for-purpose tool within these institutions’ respective web archiving programmes.

Who is involved:

National Library of New Zealand:

Steve Knight
Andrea Goethals
Ben O’Brien
Gillian Lee
Susanna Joe
Sholto Duncan

Koninklijke Bibliotheek:

Peter de Bode
Jeffrey van der Hoeven
Hanna Koppelaar
Tymen Kwant
Barbara Sierman
René Voorburg
Kees Teszelszky

Further reading: