By Nicola Bingham, Lead Curator, Web Archiving, the British Library; Co-chair, IIPC Content Development Working Group
On 4th October 2021, at the request of several of its members, the Content Development Group (CDG) initiated a thematic website collection in response to recent developments in Afghanistan.
Recent events in Afghanistan have precipitated a humanitarian crisis, which escalated markedly after foreign armed forces began withdrawing from the country in May 2021.¹ As US and Allied troops retreated, the Taliban quickly gained ground, seizing cities across the country and raising the threat of a worsening civil war. The Taliban have now claimed control of all major cities in Afghanistan, including the capital, Kabul, where fighters have seized the presidential palace, forcing the president to flee. The Afghan government, which was supported by the US and the Allies, has collapsed and power has passed to the Taliban.
As violence intensifies across large areas of the country, civilians are being caught up in the fighting and hundreds of Afghans have been killed in recent weeks, while thousands have been forced to flee their homes.
The humanitarian crisis is of great concern internationally; however, the cultural heritage of Afghanistan is also under threat. As described by Richard Ovenden in an article in the Financial Times (24th September 2021), the global Library and Archive community has been trying to do what it can, from concerted efforts to help Afghans working in the cultural heritage sector to leave the country, to supporting the preservation of cultural artefacts, including digital materials.²
It is likely that the new regime will want to bring the Internet under greater censorship and control,³ meaning that web content and the information it contains are at risk. Alongside this internal threat is the risk that foreign internet service providers, largely based in the US, could turn off cloud servers, social media platforms and similar services if America decided to act on its threat to impose sanctions on Afghanistan.⁴
Existing collecting efforts
Rapid response collecting of at-risk Afghan Internet content has already been undertaken by several archiving institutions, alongside ongoing Afghanistan collections curated by the Library of Congress. Examples include:
- University of California Berkeley: At-Risk Afghanistan Web Archiving Project 2021
- Afghanistan News collected by Mark Graham, Internet Archive
- Library of Congress: Afghanistan Web Archive
- Arquivo.pt automated seed collection on Afghan websites & news content, 17th August 2021⁵
The CDG does not wish to duplicate these efforts but rather to complement them by focussing on the international aspects of events in Afghanistan, documenting transnational involvement and worldwide interest in the process of the change of regime, recording how the situation evolves over time.
With this in mind, the Afghanistan collection has been scoped to adhere to the CDG’s broader content development policy, which requires that a collection meet the following criteria:
- It is of high interest to IIPC members;
- It does not map to any one member’s responsibility or mandate;
- It is of higher value to research because it represents more perspectives than similar collections in only one member archive would do;
- It is transnational in scope, but not necessarily “global”.
The aim of any CDG collection is to reflect multiple viewpoints and to preserve a snapshot of society as it was at the time of archiving. It will be important to researchers that websites from across the spectrum of all human activity are collected in order to present a more accurate picture of the times.
Websites produced by the Taliban or IS, or that are pro-Taliban/IS, can be included in the collection. In any case, most government websites are likely to begin expressing pro-Taliban views.
The Taliban/IS are likely to have used communication networks that cannot be archived for technical reasons, e.g. Facebook and WhatsApp, and so this type of content will be excluded.
Sub-topics may include:
- Military experience in Afghanistan; nations withdrawing armed forces from Afghanistan; statements of defence and military analysis
- Analysis and policy of think tanks such as Chatham House (UK), and the Brookings Institution and RAND International Affairs (US)
- Afghan refugees in Pakistan, Iran and elsewhere
- International relief efforts (The Red Cross, United Nations etc.)
- Diaspora communities – Afghan people around the world
- Human rights/Women’s rights/LGBTQ+ rights
- Foreign embassies and diplomatic relations
- Sanctions imposed against Afghanistan by foreign powers
- Transnational websites and social media (SoundCloud, Squarespace, Twitter, WordPress, YouTube, and Facebook Group pages, but not individual Facebook profiles) about Afghanistan from any country and in any language.
The list is not exhaustive, and it is expected that contributors may wish to explore other sub-topics within their own areas of interest and expertise, provided they fall within the general collection development scope.
The lead curator for this collection is Nicola Bingham. She will be responsible for developing the content strategy, overseeing the progression of the collection, and promoting the collection to potential users.
IIPC members, together with a wide range of stakeholders in the Library and Archive community, including staff at the Bodleian Libraries, Oxford, as well as members of the public, are expected to contribute to the collection (see below for details about how to contribute).
Crawls are being undertaken in Archive-It by Janko Klasinc (National and University Library, Slovenia) and Carlos Lelkes-Rarugal (Assistant Web Archivist, British Library).
Size of collection
The CDG’s full data budget for 2021 is 4 TB, of which 1.8 TB had been used by the end of September. The CDG plans to undertake small crawls for its ongoing and new collections, as follows:
- 2020 Summer Olympics and Paralympics [held in 2021]
- Novel Coronavirus (COVID-19)
- Intergovernmental Organizations
- National Olympic and Paralympic Committees.
At this stage, c. 400 GB of data has been allocated to the Afghanistan collection.
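The figures above can be combined into a quick headroom check. The following is only a minimal sketch: it assumes decimal units (1 TB = 1000 GB), treats the "c. 400 GB" allocation as exactly 400 GB, and uses a hypothetical helper name that is not part of any CDG tooling.

```python
# Figures quoted in the text above.
TOTAL_BUDGET_TB = 4.0    # CDG data budget for 2021
USED_TB = 1.8            # used by the end of September
AFGHANISTAN_GB = 400     # approximate allocation ("c. 400 GB")

GB_PER_TB = 1000         # assumption: decimal units


def headroom_tb(total_tb, used_tb, allocation_gb):
    """Return (budget left before the allocation, budget left after it), in TB."""
    remaining = total_tb - used_tb
    after_allocation = remaining - allocation_gb / GB_PER_TB
    # Round to two decimals to hide floating-point noise.
    return round(remaining, 2), round(after_allocation, 2)


remaining, after_allocation = headroom_tb(TOTAL_BUDGET_TB, USED_TB, AFGHANISTAN_GB)
print(f"Remaining before allocation: {remaining} TB")
print(f"Left for the other collections: {after_allocation} TB")
```

Under these assumptions, about 2.2 TB remained at the end of September, and the Afghanistan allocation would leave roughly 1.8 TB to share among the other crawls listed above.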
Nominations will be sought from IIPC Members and external agencies such as the UK Legal Deposit Libraries, University Libraries and the Library and Archive community. A Google form will be sent out to elicit nominations from non-IIPC members and members of the public. This form contains the relevant metadata fields which will populate a Google sheet. The aim of distributing the work of co-curation for the collection is to enable a diverse range of communities and individuals to contribute, including members of the Afghanistan community, helping to ensure that the collection is as representative as possible.
IIPC Members will be able to add their nominations directly to a Google sheet. The lead curator will review each nomination against the collection scope and mark those accepted for inclusion in the collection.
Access to the collection will be through the Internet Archive’s Archive-It interface. Metadata will be exposed as facets on the collection home page and will be browseable by users.
How to contribute:
- Please read the Collection Scoping Document, which goes into more detail about what is in and out of scope.
- If you are an IIPC member, please nominate URLs and add basic metadata to this Google Sheet.
- If you are not an IIPC member you may contribute nominations and a small amount of basic metadata on this Google form.
1 Kiely, E. and Farley, R. Timeline of U.S. Withdrawal from Afghanistan. August 17, 2021. FactCheck.org
2 Ovenden, R. The Battle for Afghanistan’s libraries. September 24, 2021. Financial Times. https://www.ft.com/content/82fffcc8-3631-48dc-829d-44f237549a59
3 Afghanistan’s Internet: who has control of what? Goman Web. September 20, 2021. https://gomanweb.net/2021/09/20/afghanistans-internet-who-has-control-of-what/
Digital oppression in Afghanistan. NordVPN Blog. August 20, 2021. https://nordvpn.com/blog/digital-oppression-in-afghanistan
Baibhawi, R. Taliban Shuts Internet In Panjshir To Stop Northern Alliance From Galvanizing Support. August 29, 2021. Republic. https://www.republicworld.com/world-news/rest-of-the-world-news/taliban-shuts-internet-in-panjshir-to-stop-northern-alliance-from-galvanizing-support.html
Vavra, S. and Falzone, D. This Is Why the Taliban Keeps F*cking Up the Internet. September 16, 2021. Daily Beast. https://www.thedailybeast.com/this-is-why-the-taliban-keeps-fcking-up-afghanistans-internet
4 Sorkin, A. R., Karaian, J., Kessler, S., Gandel, S., Hirsch, L., Livni, E. and Schaverien, A. Big Tech and the Taliban. August 19, 2021. The New York Times. https://www.nytimes.com/2021/08/19/business/dealbook/taliban-social-media.html