Effective Crowd Expertise Modeling via Cross Domain Sparsity and Uncertainty Reduction

Characterizing crowd expertise is vital to online applications where the crowd plays a central role, such as StackExchange for question answering and LinkedIn as a workforce market. With accurately estimated worker expertise, new jobs can be assigned to the right workers more effectively and efficiently. Most existing methods rely solely on sparse worker-job interactions, yielding poorly estimated expertise that does not generalize well to the large number of unseen jobs. Although transfer learning can draw on external domains to mitigate this sparsity, the auxiliary domains can themselves suffer from incomplete information, leading to inferior performance. A principled framework for handling sparse and incomplete data to achieve better expertise modeling is lacking. Building on multitask learning, we propose a framework that alternates between two domains, using the knowledge learned from one to gradually resolve the data sparsity or incompleteness problem in the other. Experimental results on several question-answering datasets demonstrate the effectiveness and convergence of the iterative framework.
