Finding and expressing news from structured data

As the volume of available information grows, separating news signals from noise has become increasingly resource- and time-intensive for journalists. News media companies play an important role in filtering this flood of information and explaining it to the public. However, as more data sources become available, human journalists cannot catch and report on all the news. This limitation, coupled with the need for media companies to continuously provide value to their readers, calls for automated solutions such as generating news automatically from data. To support journalists and media companies, and to provide value to audiences, this work proposes approaches for automatically finding news, or newsworthy events, in structured data using statistical analysis. Using a real natural language news generation system as a case study, we demonstrate the feasibility and benefits of automating these processes. In particular, the paper shows that automating the news generation process, including the generation of textual news articles, allows a large amount of news to be expressed to audiences in digestible formats, at varying levels of locality, and in multiple languages. In addition, automation allows audiences to tailor or personalize the news they want to read. The results of this work thus support and broaden the news offering and experience for both media companies and the public.
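To make the idea of finding newsworthy events through statistical analysis more concrete, the following is a minimal sketch and not the method used in the paper, whose specific statistics are not given in this abstract. It assumes that a value is more newsworthy the further it lies outside the interquartile range of comparable values. The function names, the threshold of 1.5, and the municipality figures are all hypothetical and chosen only for illustration.

```python
from statistics import quantiles


def outlier_scores(observations):
    """Score each observation by its distance outside the interquartile range.

    `observations` maps a label (e.g. a municipality) to a numeric value.
    Values inside [Q1, Q3] score 0.0; values outside score their distance
    from the nearest quartile, measured in IQR units.
    """
    q1, _median, q3 = quantiles(list(observations.values()), n=4)
    iqr = q3 - q1
    if iqr == 0:
        # No spread in the middle half of the data: nothing stands out.
        return {label: 0.0 for label in observations}
    scores = {}
    for label, value in observations.items():
        if value > q3:
            scores[label] = (value - q3) / iqr
        elif value < q1:
            scores[label] = (q1 - value) / iqr
        else:
            scores[label] = 0.0
    return scores


def find_newsworthy(observations, threshold=1.5):
    """Return (label, score) pairs above `threshold`, most unusual first.

    The threshold is an assumed cut-off, not a value from the paper.
    """
    scores = outlier_scores(observations)
    flagged = [(label, s) for label, s in scores.items() if s > threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: some yearly statistic per municipality.
sample = {"A": 10, "B": 11, "C": 12, "D": 11, "E": 10, "F": 12, "G": 40}
for label, score in find_newsworthy(sample):
    print(f"Municipality {label} stands out (score {score:.1f})")
```

In a full news generation pipeline, pairs flagged this way could serve as candidate facts to be ranked, filtered by locality or reader preference, and then passed to a text generation step.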
