Joint Learning of Answer Selection and Answer Summary Generation in Community Question Answering

Community question answering (CQA) has gained increasing popularity in both academia and industry in recent years. However, crowdsourced answers are often redundant and lengthy, which limits the performance of answer selection and leads to reading difficulties and misunderstandings for community users. To solve these problems, we tackle the tasks of answer selection and answer summary generation in CQA with a novel joint learning model. Specifically, we design a question-driven pointer-generator network, which exploits the correlation between the question and the answer to help attend to the essential information when generating answer summaries. Meanwhile, we leverage the answer summaries to alleviate noise in the original lengthy answers when ranking the relevance of question-answer pairs. In addition, we construct a new large-scale CQA corpus, WikiHowQA, which contains long answers for answer selection as well as reference summaries for answer summarization. The experimental results show that the joint learning method effectively addresses the answer redundancy issue in CQA and achieves state-of-the-art results on both answer selection and text summarization tasks. Furthermore, the proposed model is shown to transfer well and to be applicable to resource-poor CQA tasks, which lack reference answer summaries.
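As a rough illustration of the pointer-generator idea mentioned above, the following minimal sketch mixes a vocabulary distribution with a copy distribution over answer tokens, where the attention is conditioned on a question vector. It is not the paper's implementation: the function names, the additive way the question vector enters the attention query, and all toy numbers are assumptions made for this example only.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def question_driven_attention(answer_states, decoder_state, question_vector):
    # Assumption: the question vector is simply added to the decoder state
    # to form the attention query; the paper may combine them differently.
    query = [d + q for d, q in zip(decoder_state, question_vector)]
    return softmax([dot(h, query) for h in answer_states])

def final_distribution(vocab_dist, attention, answer_token_ids, p_gen, extended_size):
    # Standard pointer-generator mixture: generate from the fixed vocabulary
    # with probability p_gen, otherwise copy an answer token via attention.
    final = [0.0] * extended_size
    for i, p in enumerate(vocab_dist):
        final[i] = p_gen * p
    for a, tok in zip(attention, answer_token_ids):
        final[tok] += (1.0 - p_gen) * a
    return final

# Toy inputs (hypothetical): three answer tokens with 2-dimensional states.
answer_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.5, 0.0]
question_vector = [0.0, 1.0]
attn = question_driven_attention(answer_states, decoder_state, question_vector)

vocab_dist = [0.7, 0.2, 0.1]   # fixed vocabulary of size 3
answer_token_ids = [1, 3, 3]   # id 3 is out-of-vocabulary (extended slot)
final = final_distribution(vocab_dist, attn, answer_token_ids, p_gen=0.6, extended_size=4)
print(final)
```

In this sketch, the out-of-vocabulary answer token (id 3) receives probability only through the copy path, which is what allows a pointer-generator summarizer to reproduce rare words from a long answer.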
