How to Ask Better Questions? A Large-Scale Multi-Domain Dataset for Rewriting Ill-Formed Questions

We present a large-scale dataset for the task of rewriting an ill-formed natural language question into a well-formed one. Our Multi-Domain Question Rewriting (MQR) dataset is constructed from human-contributed Stack Exchange question edit histories. The dataset contains 427,719 question pairs drawn from 303 domains. We provide human annotations for a subset of the dataset as a quality estimate. When moving from ill-formed to well-formed questions, question quality improves by an average of 45 points across three aspects. We train sequence-to-sequence neural models on the constructed dataset and obtain an improvement of 13.2% in BLEU-4 over baseline methods built from other data resources. We release the MQR dataset to encourage research on the problem of question rewriting.
