Social media data archives in an API-driven world

In this article, we explore the long-term preservation implications of the application programming interfaces (APIs) that govern access to data extracted from social media platforms. We begin by introducing the preservation problems that arise when APIs are the primary way to extract data from platforms, and how these tensions fit with existing models of archives and digital repository development. We then define a range of API user types motivated to access social media data from platforms and consider how these users relate to principles of digital preservation. We discuss how platforms’ policies and terms of service govern what kinds of access are possible through these APIs, and how the current access regime poses persistent problems for archivists who seek to provide access to collections of social media data. We conclude by surveying emerging models for access to social media data archives found in the USA, including community-driven not-for-profit archives, university research repositories, and early industry–academic partnerships with platforms. Given the important role these platforms occupy in capturing and reflecting our digital culture, we argue that archivists and memory workers should apply a platform perspective when confronting the rich problem space that social platforms and their APIs present for social media data archives, and should assert their role as “developer stewards” in preserving culturally significant data from social media platforms.
