Retrieving Rising Stars in Focused Community Question-Answering

In Community Question Answering (CQA) forums, a small fraction of users typically provide high-quality posts and earn very high reputation from the community. These top contributors are critical to the community because they drive the development of the site and attract traffic from Internet users. Identifying these individuals could be highly valuable, but it is not an easy task. Unlike publication or social networks, most CQA sites lack information about peers, friends, or collaborators, which can be an important signal of future success or performance. In this paper, we address this problem by extracting different sets of features to predict future contribution. The experiment covers 376,000 users who remained active on Stack Overflow for at least one year and together contributed more than 21 million posts. A highlight of our approach is that it can identify rising stars after short observation periods. It achieves high accuracy, 85%, when predicting whether a user will become a top contributor after a few weeks of observation. In a slightly different setting, where we observe a few posts by a user rather than a fixed time window, our method achieves accuracy higher than 90%. Our approach is more accurate than baseline methods, including a popular time series analysis. Furthermore, our methods are robust to the choice of classification algorithm. Identifying rising stars early could help CQA administrators gain an overview of the site's future and ensure that enough incentives and support are given to potential contributors.
