Quality Prediction of Newly Proposed Questions in CQA by Leveraging Weakly Supervised Learning

Community Question Answering (CQA) websites provide a platform for users to ask questions and share their knowledge. Good questions on CQA websites improve the user experience and attract more users. To the best of our knowledge, few studies have examined question quality, particularly the quality of newly proposed questions. In this work, we define a good question as one that is both popular and answerable on a CQA website. Community features of questions are extracted automatically and used to collect a large set of good questions. The text features and asker features of these good questions are then used to train a weakly supervised model, based on a Convolutional Neural Network (CNN), that recognizes good newly proposed questions. We conduct extensive experiments on a publicly available StackExchange dataset, and our best result achieves an F1-score of 91.5%, outperforming the baselines.
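The weak-labeling step described above can be illustrated with a minimal sketch. The abstract does not state which community features or thresholds are used, so the field names (`score`, `view_count`, `answer_count`, `has_accepted_answer`) and the cutoffs (`min_score`, `min_views`) below are hypothetical assumptions chosen only to show the idea of deriving "popular and answerable" labels automatically.

```python
from dataclasses import dataclass


@dataclass
class Question:
    # All fields are hypothetical stand-ins for community features.
    title: str
    score: int                 # net votes, used here as a popularity signal
    view_count: int            # views, used here as a popularity signal
    answer_count: int          # number of answers, an answerability signal
    has_accepted_answer: bool  # accepted answer, an answerability signal


def weak_label(q: Question, min_score: int = 5, min_views: int = 1000) -> int:
    """Return 1 if the question looks both popular and answerable, else 0.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    popular = q.score >= min_score and q.view_count >= min_views
    answerable = q.answer_count > 0 and q.has_accepted_answer
    return 1 if (popular and answerable) else 0


# Example: label a small batch of older questions whose community
# features are already available.
questions = [
    Question("How do I reverse a list?", 12, 5400, 3, True),
    Question("Why doesn't my code work?", 0, 40, 0, False),
    Question("Best editor?", 8, 3000, 4, False),
]
labels = [weak_label(q) for q in questions]
```

Labels produced this way could then serve as training targets for a classifier that sees only the text and asker features, since those are the only features available when a question is newly posted.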
