Journalistic Source Discovery: Supporting The Identification of News Sources in User Generated Content

Many journalists and newsrooms now incorporate audience contributions into their sourcing practices by leveraging user-generated content (UGC). However, their needs and practices when seeking information from UGC are still not deeply understood by researchers or well supported by tools. This paper first reports the results of a qualitative interview study with nine professional journalists about their UGC sourcing practices, detailing what journalists typically look for in UGC and elaborating on two UGC sourcing approaches: deep reporting and wide reporting. These findings then inform a human-centered design approach to prototyping a UGC sourcing tool, which enables journalists to interactively filter and rank UGC based on users’ example content. We evaluate the prototype with nine professional journalists who source UGC in their daily routines to understand how it enables and transforms UGC sourcing practices, and to uncover opportunities for future research and design to support journalistic sourcing practices and sensemaking processes.
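To make the idea of filtering and ranking UGC by example content concrete, the following is a minimal sketch, not the prototype's actual method, which the abstract does not specify. It assumes that the person using the tool supplies a few example texts, that those examples are merged into one bag-of-words query, and that each UGC item is scored by cosine similarity to that query. The function names (`tokenize`, `cosine`, `rank_by_examples`) and the `min_score` threshold are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_examples(items, examples, min_score=0.0):
    """Score each UGC item against the merged example texts.

    Returns (score, item) pairs with score >= min_score,
    sorted from most to least similar.
    """
    # Assumption: all examples are merged into a single term-count query.
    query = Counter()
    for example in examples:
        query.update(tokenize(example))

    scored = [(cosine(Counter(tokenize(item)), query), item) for item in items]
    kept = [pair for pair in scored if pair[0] >= min_score]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    ugc = [
        "The bridge on Main Street flooded this morning",
        "I love this recipe for pancakes",
        "Flooded streets near the river, water is rising",
    ]
    examples = ["Flooding on my street, water everywhere"]
    for score, text in rank_by_examples(ugc, examples, min_score=0.1):
        print(f"{score:.2f}  {text}")
```

Raising `min_score` acts as the filter and the sort order acts as the ranking. Exact word overlap misses related forms such as "flooded" and "flooding", so a tool of this kind would likely need a richer text representation than this sketch uses.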
