No Landslide for the Human Journalist - An Empirical Study of Computer-Generated Election News in Finland

In an age of struggling news media, automated generation of news via natural language generation (NLG) methods could be of great help, especially in areas where the amount of raw input data is large and the structure of the data is known in advance. One such news automation system is the Valtteri NLG system, which generates news articles about the Finnish municipal elections of 2017. To evaluate the quality of Valtteri-produced articles and to identify aspects to improve, $n=152$ users were asked to evaluate the output of Valtteri. Each evaluator rated six preselected computer-generated articles, four control articles written by journalists, and four computer-generated articles of their own choice. All the articles were evaluated along four dimensions: credibility, liking, quality, and representativeness. As expected, the texts written by Valtteri received lower ratings than those written by journalists, but overall the ratings were satisfactory (average 2.9 versus 4.0 for journalists on a five-point scale). Valtteri’s best rating (3.6) was for credibility. The computer-written articles that the evaluators could freely select received slightly better ratings than the preselected computer-written articles. Broken down by demographic group, males aged 55 or older liked the automatic articles best and females aged 34 or younger liked them least. Evaluators mistook 21% of the computer-written articles for human-written ones and 10% of the human-written articles for computer-written ones. The share of users making these mistakes grew with age. Overall, male evaluators made fewer writer-identification mistakes than female evaluators.
