Query-focused Sentence Compression in Linear Time

Search applications often display shortened sentences that must contain certain query terms and fit within the space constraints of a user interface. This work introduces a new transition-based sentence compression technique developed for such settings. Our query-focused method constructs length- and lexically constrained compressions in linear time by growing a subgraph in the dependency parse of a sentence. This theoretically efficient approach achieves an 11x empirical speedup over baseline integer linear programming (ILP) methods while better reconstructing gold constrained shortenings. Such speedups help query-focused applications because users are measurably hindered by interface lag. Additionally, our technique requires neither an ILP solver nor a GPU.
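As an illustration of the general idea of growing a subgraph of a dependency parse under a length budget, the following is a minimal sketch and not the method evaluated in this work. It assumes the parse is given as a list of head indices, that the budget is measured in characters, and that a hypothetical `accept` callback stands in for whatever learned model decides whether a token is added. All function and parameter names are invented for illustration.

```python
from collections import deque
from typing import Callable, List, Set


def grow_compression(
    tokens: List[str],
    heads: List[int],
    query: Set[int],
    budget: int,
    accept: Callable[[int, Set[int]], bool],
) -> str:
    """Grow a set of kept tokens outward from the query tokens.

    heads[i] is the index of token i's head, or -1 for the root.
    `accept` is a hypothetical stand-in for a learned decision function.
    Each token is enqueued at most once and each edge is inspected a
    constant number of times, so the loop is linear in sentence length.
    """
    children: List[List[int]] = [[] for _ in tokens]
    for i, h in enumerate(heads):
        if h >= 0:
            children[h].append(i)

    def neighbors(i: int) -> List[int]:
        head = [heads[i]] if heads[i] >= 0 else []
        return head + children[i]

    kept: Set[int] = set(query)
    # Length of the kept tokens when joined by single spaces.
    cur_len = sum(len(tokens[i]) for i in kept) + max(len(kept) - 1, 0)
    if cur_len > budget:
        raise ValueError("query terms alone exceed the length budget")

    seen: Set[int] = set(kept)
    frontier: deque = deque()
    for i in sorted(kept):
        for j in neighbors(i):
            if j not in seen:
                seen.add(j)
                frontier.append(j)

    while frontier:
        i = frontier.popleft()
        cost = len(tokens[i]) + (1 if kept else 0)  # token plus a space
        if cur_len + cost <= budget and accept(i, kept):
            kept.add(i)
            cur_len += cost
            # Only accepted tokens expand the frontier, so every added
            # token is adjacent to something already kept.
            for j in neighbors(i):
                if j not in seen:
                    seen.add(j)
                    frontier.append(j)

    # Emit kept tokens in their original order without sorting.
    return " ".join(tokens[i] for i in range(len(tokens)) if i in kept)


if __name__ == "__main__":
    toks = ["Police", "arrested", "the", "suspect", "in", "Boston", "yesterday"]
    hds = [1, -1, 3, 1, 5, 1, 1]
    # Query term: "Boston" (index 5); budget: 30 characters.
    print(grow_compression(toks, hds, {5}, 30, lambda i, kept: True))
    # prints: Police arrested in Boston
```

In this sketch the always-true `accept` makes the procedure purely greedy. A real system would presumably replace it with a trained classifier, and the visiting order of the frontier would also matter for output quality.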
