Where to Post: Routing Questions to Right Community in Community Question Answering Systems

Question-answer (QA) sites have become one of the most important venues for sharing information. To ease search and categorization, QA sites create communities, each devoted to a specific topic or interest, and a large number of such communities have been created in the last few years. Much research on community QA sites has addressed problems such as expert identification and tag recommendation. However, an important problem that has been neglected so far is automatically routing a question to the right community. In this paper, we propose a novel word-embedding-based method to route a question to the right community. We use syntactic as well as semantic features to characterize both questions and communities. Although this characterization performs well, it is computationally expensive. To address this, we use topic modeling, which effectively summarizes a community and reduces computation time. Our experimental results show that using both syntactic and semantic features helps question routing and leads to better community prediction. We evaluate our methods on Stack Exchange, a well-known question answering system, and show the effectiveness of the proposed method.
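The general idea of embedding-based routing can be illustrated with a minimal sketch. It is not the method proposed in the paper: it covers only a semantic signal (averaged word vectors compared by cosine similarity) and omits syntactic features entirely. The embedding table, the community word lists, and the function names are all hypothetical and chosen for illustration; the short word lists merely stand in for the kind of compact community summary a topic model might produce.

```python
import math

# Toy 3-dimensional word vectors, invented purely for illustration.
# A real system would load embeddings trained on a large corpus.
TOY_EMBEDDINGS = {
    "python": [0.90, 0.10, 0.00],
    "loop":   [0.80, 0.20, 0.10],
    "code":   [0.85, 0.15, 0.05],
    "recipe": [0.00, 0.90, 0.10],
    "bake":   [0.10, 0.95, 0.00],
    "oven":   [0.05, 0.90, 0.10],
}


def mean_vector(tokens):
    """Average the vectors of known tokens; return None if none are known."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def route_question(question, community_summaries):
    """Rank communities by similarity between question and summary vectors."""
    question_vec = mean_vector(question.lower().split())
    if question_vec is None:
        return []  # no known words, so nothing to compare
    scores = []
    for name, summary_words in community_summaries.items():
        community_vec = mean_vector(summary_words)
        if community_vec is not None:
            scores.append((name, cosine(question_vec, community_vec)))
    # Highest similarity first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical community summaries: a few representative words each.
COMMUNITIES = {
    "programming": ["python", "code", "loop"],
    "cooking": ["recipe", "bake", "oven"],
}

ranking = route_question("how to bake in oven", COMMUNITIES)
best_community = ranking[0][0]
```

Representing each community by a handful of summary words keeps the comparison cheap: the cost per question grows with the number of communities rather than with the number of posts in each one, which is the kind of saving the abstract attributes to topic modeling.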
