Youssef Eldakar, Head of the International School of Information Science at Bibliotheca Alexandrina and the IIPC Chair 2023-2024
We are already in the second quarter of a new IIPC year. 2023 marks the first year since 2019 that we are finally able to meet in person again at the General Assembly and the Web Archiving Conference, this time kindly hosted by the Netherlands Institute for Sound and Vision in Hilversum and co-organised by KB, National Library of the Netherlands. I have the pleasure of chairing the Consortium in 2023, a year that also marks 20 years of us working together to preserve content on the web. I would like to start by thanking Abbie Grotke of the Library of Congress (2021 Chair, 2022 Vice Chair) and Kristinn Sigurðsson of the National and University Library of Iceland (2021 Vice-Chair, 2022 Chair) for their leadership in 2021 and 2022. I would also like to thank Ian Cooke of the British Library (2022 Treasurer) for continuing on in his role as the IIPC Treasurer this year.
Time flies when we are having fun, so it is very hard to believe that it has already been 20 years of working together as a consortium to forward the domain of digital preservation of web content. Our anniversary year officially starts in July (IIPC was founded in Paris on July 24, 2003), but we would like to use our meeting in Hilversum to reflect on lessons learned and what we have achieved as a community over the past two decades. Our shared endeavor of capturing and preserving the ever-changing web and creating sustainable programs and practices is aptly captured by both this year’s conference theme, “Resilience and renewal,” and the following statement: “Web archiving practice has needed to demonstrate resilience in the face of the challenges. It has also required sustained innovation and renewal to find novel and practical ways to try to overcome obstacles and to demonstrate (and add to) the value of web archiving programs.”
What should not be underestimated is the amount of knowledge that has accumulated over the past 20 years, which both web archiving practitioners and researchers have so generously shared at various IIPC events including our annual General Assembly and Web Archiving Conference, as well as IIPC projects, working groups and task forces. The IIPC GA has now been held 18 times across the world and online, and our members and IIPC staff have organised and contributed to many more workshops, webinars, training events, member updates, working group meetings, and technical calls.
I would like to thank the University of North Texas University Libraries and IIPC Staff for ensuring that these valuable resources are preserved, searchable and easily accessible through the IIPC collections at UNT. IIPC members and the wider web archiving community have also produced publicly available documentation of tools, including the Awesome Web Archiving List. Thanks to the efforts of the IIPC Training Working Group, we now also have training materials introducing a wide range of web archiving topics which have been used extensively.
Interestingly, this IIPC twentieth anniversary year overlaps with that of Bibliotheca Alexandrina. In 2002, just a few months before the inauguration of the revived Library of Alexandria, Brewster Kahle (founder of the Internet Archive) was in Alexandria, Egypt on a mission where the IA team and the BA team worked together to install a copy of IA’s web archive collection at the time inside the new library’s building. In addition to putting in place web archiving as a function of the library before its opening, this initiative was quite symbolic, alluding to the idea that preserving the web in modern times is rooted in preserving human knowledge regardless of the medium, be it papyrus scrolls in the ancient library centuries ago or digital documents on computer storage in the 21st Century.
IIPC started with 12 founding members: 11 national libraries and the Internet Archive. Over 20 years, the consortium has expanded to 54 members and now also includes university libraries, audiovisual institutes, service providers and an independent open-source project. I would like to take this opportunity to welcome our newest members, Smithsonian Libraries and Archives (joined in 2022) and Webrecorder (joined in 2023).
First In-Person Event in Almost 4 Years
The IIPC has made it through recent times during which all interaction was virtual and we have even expanded our activities. For three consecutive years, from 2020 to 2022, due to the Coronavirus pandemic, the IIPC held all its activities and events, including the annual General Assembly and Web Archiving Conference, online. While we are grateful that technology allowed us to go on uninterrupted to carry out the functions of the Consortium and even expand the audience of some of our events, we are also grateful for the opportunity to gather again as an in-person community this year, an experience which has proven over the years to be an opportunity for a truly dynamic interchange of ideas and experiences for many in the community.
One of our key IIPC-funded activities that continues this year is our Collaborative Collections, which are led by the Content Development Working Group (CDG) and supplemented by contributions from the community. Through this effort, the IIPC has been doing its fair share of building web archive collections that are transnational in scope and thus fitting to the consortium’s internationally diverse makeup. These collections offer members the chance to contribute to important global collections that may otherwise be outside of member organisations’ collecting scope. Last year, we notably launched a collection on archiving the War in Ukraine. This and many other collaborative collections are available both through Archive-It and Bibliotheca Alexandrina. This new access brings together web archive collections and tools developed by IIPC members: LinkGate and SolrWayback.
BA is currently in the process of moving the Solr index for the SolrWayback access interface to higher-end storage in order to achieve adequate search performance. This year, we will continue working with the IIPC Research Working Group on mapping possibilities for researcher use of our collections through these additional access points.
One of the IIPC’s key goals is to foster the development of tools for web archiving. While this year marks two decades for the IIPC since its founding, it also marks a major step forward toward a more complete set of tools to cover key aspects of the web archiving process: capture, playback and analysis.
Following our members’ interests and needs, since last year, IIPC has been supporting the development of Webrecorder’s Browsertrix Cloud. This two-year project, led by The British Library, National Library of New Zealand, University of North Texas Libraries and the Royal Danish Library, is an excellent example of an IIPC collaborative approach to tools development. While the work on this continues with significant support from the community involved in testing the tool, our members have already presented on their use of Browsertrix at IIPC workshops, webinars and, most recently, at the WAC 2023 Online Day and in-person conference.
In the past few years, IIPC has supported members in transitioning from OpenWayback, a tool developed and maintained by IIPC members, to Python Wayback, or pywb, developed by Ilya Kreymer of Webrecorder. As I mentioned earlier, one of the strengths of our community is the willingness to share knowledge. In addition to informal calls and exchanges on the pywb deployment at IIPC member institutions, our members have been presenting their processes at IIPC webinars, a use cases series that will continue later in 2023.
This brings me to tools for analysis and research. As we are getting more and more excellent examples of research use presented at our Research Webinar Series and at WAC, we have also been following the development of tools for analysing web archives. I have already mentioned SolrWayback, which was developed at the Royal Danish Library and featured at multiple IIPC conferences, including this year, and I would also like to congratulate the Internet Archive and the Archives Unleashed on creating ARCH (Archives Research Compute Hub) – also presented at this year’s WAC. AWAC2 (Analysing Web Archives of the COVID-19 Crisis) is one of the projects that used the new research hub to analyse IIPC’s biggest collection, and we are hoping that the combination of access through Archive-It and the BA mentioned earlier will draw more researchers to our collaborative collections.
Members Survey 2023
There has been a growing interest in computational access to web archives and engagement with researchers, and many of our members have been working on making their collections available to researchers. As a response to this trend, we have made sure that our new members survey includes a detailed section on the topic. The survey, based on one we conducted in 2017, also covers many other main areas in web archiving programs. The results of the previous survey helped us shape our current Strategic Plan and Consortium Agreement. Our hope is that the new survey will allow us the opportunity to further shape IIPC strategic priorities and guide our funding programs. Created jointly by the Membership Engagement Portfolio and IIPC Staff, the survey is another key activity for this year which will allow us to better serve the web archiving community.
The IIPC has been investing significant time and resources over the years into developing tools for web archiving to serve the consortium’s primary goal of preserving Internet content. An investment in tools development is only truly rewarding when the developed tools find their way to the right hands that will put them to effective use. The Tools Development Portfolio and the IIPC Senior Program Officer have made a proposal to provide technical training to IIPC members and anyone else interested in learning the technology used by web archiving institutions. As part of the survey mentioned earlier, we will be gathering information about the types of training that are most needed and the format that would be most suitable for our global membership.
Online activities and partner events
We are meeting in Hilversum in person, but one of our lessons learned from the past few years of online programming is the need to offer the conference to those who are unable to travel. As Kristinn pointed out last year, the COVID-19 pandemic made us an organisation for all seasons, and while we are celebrating the return to in-person events, we want to ensure that we can also provide an online forum serving the needs of members and the wider community. This year, we have already delivered an entire online day of WAC, and we will continue organising webinars, technical calls and other online events for members.
Last year we partnered with the Open Preservation Foundation (OPF) and the Impact Centre of Competence in Digitisation (IMPACT) to deliver a widely popular online panel on the preservation of digitised and born-digital collections. Our Training Working Group also co-organised a workshop on advocacy with the Digital Preservation Coalition. The Partnerships and Outreach Portfolio and the SPO will continue to work on furthering our advocacy efforts, exploring future collaborations with organisations with interests overlapping with the IIPC’s strategic priorities. We are glad to see OPF, Dutch Digital Heritage Network (DDHN) and IIPC’s administrative and financial host, Council on Library and Information Resources (CLIR), at this year’s annual event and we are looking forward to an August 2023 in-person conference in Germany organised by Nestor in partnership with the IIPC.
All of our activities are only possible thanks to our members who contribute their time and expertise, as well as offering their institutions as a home for our in-person events. We are very grateful to the Netherlands Institute for Sound and Vision for hosting this year’s GA & WAC, offering both a unique and beautiful conference location and staff time spent helping with on-site logistics and conference organization. We’re also extremely grateful to KB, National Library of the Netherlands for their work in co-organising the conference, including generous sponsorship, contributions to the conference program, and significant amounts of staff time. Our consortium’s work is really built on collaboration, and this year’s co-organised conference is no exception. Thank you to all of our members who have volunteered their time and skills, whether by working in one of the Portfolios, Task Forces, or Working Groups, helping with the annual conference, reviewing the conference proposals, showcasing their work in a webinar or workshop, or otherwise.
What we do is also made possible thanks to the IIPC staff, hosted at CLIR, who are responsible for driving all of the IIPC activities and supporting our collaborative efforts to preserve the web. One of our recent strategic goals has been to “strengthen organizational resilience via both increased engagement with the Consortium Financial and Administrative Host and additional support staff (workforce development for increased efficiency, productivity and continuity).” Since 2022, we have finally had two full-time staff members for the first time. This has made us a much more robust organisation and has allowed us to significantly reduce our dependence on the volunteer model. I would like to thank Olga Holownia, our Senior Program Officer, and Kelsey Socha-Bishop, our Administrative Officer, for all their hard work.