SummScreen: A Dataset for Abstractive Screenplay Summarization

We introduce SUMMSCREEN, a summarization dataset comprising pairs of TV series transcripts and human-written recaps. The dataset provides a challenging testbed for abstractive summarization for several reasons. Plot details are often expressed indirectly in character dialogue and may be scattered across the entirety of the transcript. These details must be found and integrated to form the succinct plot descriptions in the recaps. Also, TV scripts contain content that does not directly pertain to the central plot but rather serves to develop characters or provide comic relief; this information is rarely contained in recaps. Since characters are fundamental to TV series, we also propose two entity-centric evaluation metrics. Empirically, we characterize the dataset by evaluating several methods, including neural models and those based on nearest neighbors. An oracle extractive approach outperforms all benchmarked models according to automatic metrics, showing that the neural models are unable to fully exploit the input transcripts. Human evaluation and qualitative analysis reveal that our non-oracle models are competitive with their oracle counterparts in terms of generating faithful plot events and can benefit from better content selectors. Both oracle and non-oracle models generate unfaithful facts, suggesting future research directions.
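The abstract does not spell out the two entity-centric metrics, but their intuition is to score a generated recap by how well it covers the characters that appear in the reference recap. A minimal sketch of one plausible bag-of-entities measure, assuming a known character list for the show (the `entity_f1` helper and its inputs are illustrative, not the paper's exact formulation):

```python
def entity_f1(generated: str, reference: str, characters: set[str]) -> float:
    """Illustrative entity-centric score: F1 over the set of known
    character names mentioned in the generated recap versus the set
    mentioned in the reference recap. A stand-in sketch only; the
    paper's actual metrics may be defined differently."""
    gen = {c for c in characters if c.lower() in generated.lower()}
    ref = {c for c in characters if c.lower() in reference.lower()}
    if not gen or not ref:
        return 0.0
    overlap = len(gen & ref)
    precision = overlap / len(gen)
    recall = overlap / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical example: the generated recap covers 2 of the 3
# characters in the reference, with no spurious mentions.
chars = {"Monica", "Ross", "Rachel"}
score = entity_f1("Monica confronts Ross.",
                  "Ross and Monica argue while Rachel watches.",
                  chars)
```

Unlike n-gram metrics such as ROUGE, a set-based score like this ignores phrasing entirely and rewards only coverage of the characters involved in the plot, which matches the abstract's motivation that characters are fundamental to TV series.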
