By Nicola Bingham, Lead Curator, Web Archiving, British Library, and IIPC CDG Co-Chair
The co-chairs of the IIPC Content Development Group (CDG) are pleased to submit the following update on the group’s activity so far this year and the major projects which will occupy the group going forward in 2018.
What do we do?
For those new to the IIPC or those who may be interested in either contributing to planned collections or thinking about submitting ideas for new ones, it is worth revisiting the CDG’s mandate.
The CDG was formed in 2014 and crawling began in early 2015. The Group is charged with building publicly accessible web collections on transnational themes or events. Collections are multinational and multilingual, and cover a wide variety of perspectives. They are intended not only to be of particular value to researchers now and in the future, but also to promote awareness of web archiving globally, encouraging individuals and institutions that are not yet involved in web archiving, or that want to become involved, to find out more.
New collections can be proposed on the CDG members’ mailing list. The CDG co-chairs and the group (sometimes in consultation with researchers and others) then develop a list of collections to pursue, in line with the pre-defined criteria in the collection policy and our capacity under the budget approved by the Steering Committee. Each collection is supported by the co-chairs, who serve as project admins, while a lead curator (often, but not necessarily, the person who proposed the collection) scopes the collection, determines the metadata, monitors the collection and leads on quality assurance. Each collection is open to contributions from all members. We strive to open up the nomination procedure as widely as possible, including to non-members and members of the public, to elicit the broadest possible coverage of each topic.
Collections developed so far, via the IIPC Archive-It account, can be viewed at https://archive-it.org/home/IIPC
So far in 2018 we have completed the 2018 Winter Olympics & Paralympics Collection, which contains nearly 1,500 seeds and 1.2 TB of data. The collection covered 35 countries in 21 languages. Nominations came from a mix of IIPC members and a public nomination form that was made available through previous blog posts. For more information on this collection, see the blog posts by lead curator Helena Byrne.
In addition, we updated the National Olympic & Paralympic Committees collection with committees that were missing from the 2016 crawl. This collection was crawled again during the 2018 Winter Olympics & Paralympics. Not all National Committees have a website, but if you notice that we are missing any websites, please get in touch (2018-winter-olympics [at] iipc.simplelists.com).
We are now turning our attention to resuming the World War I Commemoration and the ‘Online News around the World’ collections.
The World War I Commemoration project, led by Peter Stirling of the BnF, started in October 2015. It already includes over 2,000 seeds and covers a wide variety of websites, from official commemorations to amateur history websites and media reporting of the centenary. Websites from several different countries and in many languages have been selected by the members of the IIPC. 2018 is an important year for this collection, as we will be looking to capture activity leading up to and during the centenary of the armistice in November.
The ‘Online News around the World’ collection, led by Sabine Schostag of the Royal Danish Library, has been several years in planning and will begin in earnest shortly. This ambitious project aims to document a selection of online news websites from as many countries as possible during one week of the year (likely to be in November 2018). Once the metadata has been finalised, we will post details of how to nominate content for this collection. The IIPC has members in over 34 countries around the world, which is already a good starting point, but we hope to canvass much more widely than this to achieve our goal of global coverage!
This summer we will also be running new crawls of the seeds in the International Cooperation Organizations collection, led by Alex Thurman of Columbia University Libraries. The collection consists of all known active websites in the .int top-level domain, which is available only to organizations created by treaties. It was started in 2016 and includes important agencies in areas that require international cooperation, such as environmental protection, economic development, and telecommunication.
In the meantime, we hope to see as many CDG members as possible for our session at the IIPC General Assembly on 12th November – more details to follow shortly.