AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization

Community Question Answering (CQA) fora such as Stack Overflow and Yahoo! Answers are a rich source of answers to a wide range of community-based questions. Each question thread can receive a large number of answers with different perspectives. One goal of answer summarization is to produce a summary that reflects the range of answer perspectives. A major obstacle for this task is the absence of a dataset to provide supervision for producing such summaries. Recent work proposes heuristics to create such data, but the resulting data are often noisy and do not cover all of the answer perspectives present. This work introduces a novel dataset of 4,631 CQA threads for answer summarization, curated by professional linguists. Our pipeline gathers annotations for all subtasks of answer summarization, including relevant answer sentence selection, grouping these sentences based on perspectives, summarizing each perspective, and producing an overall summary. We analyze and benchmark state-of-the-art models on these subtasks and introduce a novel unsupervised approach for multi-perspective data augmentation that boosts summarization performance according to automatic evaluation. Finally, we propose reinforcement learning rewards to improve factual consistency and answer coverage and analyze areas for improvement.
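To make the four annotation subtasks concrete, the following is a minimal sketch of how one annotated thread could be represented. It is an illustration only: the class name `ThreadAnnotation`, its field names, and the example thread are assumptions made here, not the released dataset schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ThreadAnnotation:
    """Hypothetical container for one annotated CQA thread (names are illustrative)."""
    question: str
    answer_sentences: List[str]
    # Subtask 1: indices of answer sentences judged relevant to the question.
    relevant_ids: List[int] = field(default_factory=list)
    # Subtask 2: perspective id -> indices of the relevant sentences grouped under it.
    clusters: Dict[int, List[int]] = field(default_factory=dict)
    # Subtask 3: perspective id -> summary of that perspective.
    cluster_summaries: Dict[int, str] = field(default_factory=dict)
    # Subtask 4: a single summary covering all perspectives.
    overall_summary: str = ""

    def validate(self) -> None:
        """Check that the four annotation layers are mutually consistent."""
        n = len(self.answer_sentences)
        relevant = set(self.relevant_ids)
        if any(i < 0 or i >= n for i in relevant):
            raise ValueError("relevant sentence index out of range")
        for cid, members in self.clusters.items():
            if not set(members) <= relevant:
                raise ValueError(f"cluster {cid} contains a sentence not marked relevant")
            if cid not in self.cluster_summaries:
                raise ValueError(f"cluster {cid} has no perspective summary")

    def perspective_inputs(self) -> Dict[int, List[str]]:
        """Return the sentence text grouped under each perspective."""
        return {
            cid: [self.answer_sentences[i] for i in members]
            for cid, members in self.clusters.items()
        }


# Invented toy thread, used only to exercise the structure above.
example = ThreadAnnotation(
    question="How do I keep basil alive indoors?",
    answer_sentences=[
        "Put it on a south-facing windowsill.",
        "I had the same problem last year.",
        "Water only when the top of the soil is dry.",
        "A grow light helps in winter.",
    ],
    relevant_ids=[0, 2, 3],
    clusters={0: [0, 3], 1: [2]},
    cluster_summaries={
        0: "Give the plant plenty of light.",
        1: "Avoid overwatering.",
    },
    overall_summary="Provide plenty of light and avoid overwatering.",
)
example.validate()
```

Keeping sentence indices rather than copied text in `relevant_ids` and `clusters` means each later layer can be checked against the earlier one, which is what `validate` does.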
