Journalism History, Web Archives, and New Methods for Understanding the Evolution of Digital Journalism

Archived webpages are a critical source of data for understanding the current state of the news media industry, as well as how the industry has changed over time. Dramatic changes in the news media industry in recent decades have occurred in tandem with the evolution of the Web. Archived webpages are valuable records for understanding and analyzing how newspaper companies have adapted to technological changes such as social media feeds and the sharing of news content via Twitter. This article outlines a methodological approach to using Web archives to examine change in the news media industry. Researchers have developed new tools to improve access to archived Web data, both to advance studies of the Web and to enable the tracking of changes in news media as they emerge over time. A case study of local news in the United States illustrates the methodological challenges and promise of working with these data and highlights the potential of Web archives for journalism research. Finally, the closing sections discuss challenges associated with the scale and scope of archived Web data and point to new areas for future research.
