No Time Like the Present: Methods for Generating Colourful and Factual Multilingual News Headlines

News headlines are the main means of briefly summarising a news article and attracting an audience. In this paper, we experiment in a practical setting with existing methods for computationally producing colourful expressions and news headlines. Our case study modifies an automated journalism system that generates multilingual news in three languages: English, Finnish and Swedish. We integrate existing methods for creative headline and figurative language generation into the system's headline generation process, adapting them to work in a multilingual setting. For evaluation, we ask online judges to assess the original titles produced by the unmodified system and those enhanced by the methods described in this paper. The results of the evaluation suggest that the presented methods increase the creativity of existing headlines while maintaining their
