Learning to rank for question routing in community question answering

This paper addresses Question Routing (QR) in Community Question Answering (CQA): routing newly posted questions to the potential answerers who are most likely to answer them. Traditional methods consider only text similarity features between the newly posted question and the user profile, ignoring important statistical features, both question-specific and user-specific. Moreover, traditional methods are based on unsupervised learning, which makes it difficult to incorporate such rich features. This paper proposes a general framework for QR based on learning to rank. First, a training set of triples (q, asker, answerers) is collected. Then, the intrinsic relationships between the asker and the answerers in each CQA session are used to capture the intrinsic labels/orders of the users with respect to their degree of expertise on the question q, and two methods, one SVM-based and one RankingSVM-based, are presented that learn models from the training set using different example creation processes. Finally, the potential answerers are ranked using the trained models. Extensive experiments on a real-world CQA dataset from Stack Overflow show that both proposed methods outperform the traditional query likelihood language model (QLLM) as well as the state-of-the-art Latent Dirichlet Allocation based model (LDA). In particular, the RankingSVM-based method achieves statistically significant improvements over the SVM-based method and gives the best performance.
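The two example creation processes can be illustrated with a minimal sketch. The abstract does not spell out the exact labeling rule or feature set, so the code below assumes that, within one session, each answerer is treated as having more expertise on q than the asker. The names `PROFILES`, `toy_featurize`, `pointwise_examples`, and `pairwise_examples` are hypothetical, and the two toy features (word overlap with the user profile, and the user's answer count) merely stand in for the text similarity and statistical features mentioned above.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Features = List[float]
Session = Tuple[str, str, Sequence[str]]  # (question q, asker, answerers)
Featurizer = Callable[[str, str], Features]

# Hypothetical user profiles: (profile text, number of past answers).
PROFILES: Dict[str, Tuple[str, int]] = {
    "alice": ("python list sort", 12),
    "bob": ("java thread", 3),
    "carol": ("python sort key", 30),
}


def toy_featurize(question: str, user: str) -> Features:
    """Toy features: word overlap with the profile, and the answer count."""
    text, answer_count = PROFILES[user]
    overlap = len(set(question.split()) & set(text.split()))
    return [float(overlap), float(answer_count)]


def pointwise_examples(
    sessions: Sequence[Session], featurize: Featurizer
) -> List[Tuple[Features, int]]:
    """SVM-style examples: one labeled vector per (question, user).

    Assumption: answerers are labeled +1 and the asker -1.
    """
    examples = []
    for question, asker, answerers in sessions:
        for user in answerers:
            examples.append((featurize(question, user), 1))
        examples.append((featurize(question, asker), -1))
    return examples


def pairwise_examples(
    sessions: Sequence[Session], featurize: Featurizer
) -> List[Tuple[Features, int]]:
    """RankingSVM-style examples: feature differences of ordered user pairs.

    Assumption: each answerer is preferred over the asker for this question.
    Both orientations of each pair are emitted so the classes are balanced.
    """
    examples = []
    for question, asker, answerers in sessions:
        low = featurize(question, asker)
        for user in answerers:
            high = featurize(question, user)
            diff = [h - l for h, l in zip(high, low)]
            examples.append((diff, 1))
            examples.append(([-d for d in diff], -1))
    return examples


SESSIONS: List[Session] = [("python sort list", "bob", ["alice", "carol"])]

point = pointwise_examples(SESSIONS, toy_featurize)
pairs = pairwise_examples(SESSIONS, toy_featurize)

for features, label in point:
    print("pointwise", features, label)
for features, label in pairs:
    print("pairwise ", features, label)
```

Under these assumptions, the pointwise examples could be passed to a binary SVM, while the pairwise difference vectors correspond to the standard reduction of ranking to classification used by RankingSVM. At routing time, candidate users would be sorted by the learned scoring function applied to their feature vectors for the new question.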