PASS: Perturb-and-Select Summarizer for Product Reviews

The product review summarization task aims to automatically produce a short summary for a set of reviews of a given product. Such summaries are expected to aggregate a range of different opinions in a concise, coherent and informative manner. Existing work on this challenging task exhibits two shortcomings. First, summarizers tend to favor generic content that appears in reviews for many different products, resulting in template-like, less informative summaries. Second, as reviewers often disagree on the pros and cons of a given product, summarizers sometimes yield inconsistent, self-contradicting summaries. We propose the PASS system (Perturb-and-Select Summarizer), which employs a large pre-trained Transformer-based model (T5 in our case) fine-tuned in a few-shot scheme. A key component of PASS is the application of systematic perturbations to the model’s input during inference, which allows it to generate multiple different summaries per product. We develop a method for ranking these summaries according to desired criteria, coherence in our case, enabling our system to almost entirely avoid the problem of self-contradiction. We compare our system against strong baselines on publicly available datasets and show that it produces summaries that are more informative, diverse and coherent.
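The perturb-and-select idea described above can be illustrated with a minimal sketch. The abstract does not specify how inputs are perturbed or how coherence is scored, so everything here is an assumption made for illustration: the perturbation (sampling and reordering subsets of the reviews), the function names `perturb_inputs` and `perturb_and_select`, and the `summarize` and `score` callables, which stand in for the fine-tuned T5 summarizer and the coherence ranker respectively.

```python
import random
from typing import Callable, List, Sequence


def perturb_inputs(
    reviews: Sequence[str], n_variants: int, keep: int, seed: int = 0
) -> List[List[str]]:
    """Hypothetical perturbation: draw random subsets of the reviews.

    Each variant holds `keep` distinct reviews in random order. This is an
    assumed perturbation scheme, not necessarily the one used by PASS.
    """
    rng = random.Random(seed)
    keep = min(keep, len(reviews))
    return [rng.sample(list(reviews), keep) for _ in range(n_variants)]


def perturb_and_select(
    reviews: Sequence[str],
    summarize: Callable[[List[str]], str],  # stand-in for the summarizer
    score: Callable[[str], float],          # stand-in for the coherence ranker
    n_variants: int = 8,
    keep: int = 4,
    seed: int = 0,
) -> str:
    """Generate one candidate summary per perturbed input, return the best."""
    variants = perturb_inputs(reviews, n_variants, keep, seed)
    candidates = [summarize(variant) for variant in variants]
    return max(candidates, key=score)


if __name__ == "__main__":
    reviews = [
        "Battery life is great.",
        "Battery life is terrible.",
        "The screen is sharp.",
        "Setup was easy.",
        "Shipping was fast.",
    ]

    # Toy stand-ins: concatenate two reviews; penalize mixed polarity words.
    def toy_summarize(rs: List[str]) -> str:
        return " ".join(rs[:2])

    def toy_score(summary: str) -> float:
        contradictory = "great" in summary and "terrible" in summary
        return -1.0 if contradictory else 0.0

    print(perturb_and_select(reviews, toy_summarize, toy_score))
```

In a real system the two callables would wrap a sequence-to-sequence model and a learned ranker; the sketch only shows the control flow of generating several candidates from perturbed inputs and keeping the highest-scoring one.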
