Cross-domain collaboration recommendation

Interdisciplinary collaborations have had a huge impact on society. However, it is often hard for researchers to establish such cross-domain collaborations. What are the patterns of cross-domain collaborations? How do those collaborations form? Can we predict this type of collaboration? Cross-domain collaborations exhibit very different patterns from traditional collaborations within a single domain: 1) sparse connection: cross-domain collaborations are rare; 2) complementary expertise: cross-domain collaborators often have different expertise and interests; 3) topic skewness: cross-domain collaborations concentrate on a subset of topics. All of these patterns violate fundamental assumptions of traditional recommendation systems. In this paper, we analyze cross-domain collaboration data from research publications and confirm the above patterns. We propose the Cross-domain Topic Learning (CTL) model to address these challenges. To handle sparse connections, CTL consolidates the existing cross-domain collaborations at the topic layer instead of the author layer, which alleviates the sparseness issue. To handle complementary expertise, CTL models the topic distributions of the source and target domains separately, as well as the correlation across domains. To handle topic skewness, CTL models only the topics relevant to cross-domain collaboration. We compare CTL with several baseline approaches on large publication datasets from different domains. CTL significantly outperforms the baselines on multiple recommendation metrics. Beyond its accurate recommendations, CTL is also insensitive to parameter tuning, as confirmed by the sensitivity analysis.
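The abstract gives no equations, so the following is only an illustrative sketch of what ranking collaborators at the topic layer rather than the author layer could look like; it is not the CTL model itself. It assumes each author already has a topic distribution in their own domain, and that a cross-domain topic weight matrix (here the hypothetical `topic_link`) has been estimated from the few observed cross-domain collaborations. The function names, the scoring rule, and all numbers are assumptions made for illustration.

```python
# Illustrative sketch only: names, scoring rule, and data are assumptions,
# not the CTL model described in the paper.

def topic_layer_scores(source_theta, target_thetas, topic_link):
    """Score target-domain authors for one source-domain author.

    source_theta:  topic distribution of the source author (source-domain topics)
    target_thetas: dict mapping target author name -> topic distribution
                   (target-domain topics)
    topic_link:    hypothetical matrix; topic_link[i][j] is the weight linking
                   source topic i to target topic j
    """
    scores = {}
    for name, theta_t in target_thetas.items():
        total = 0.0
        for i, p_s in enumerate(source_theta):
            for j, p_t in enumerate(theta_t):
                total += p_s * topic_link[i][j] * p_t
        scores[name] = total
    return scores


def recommend(source_theta, target_thetas, topic_link, top_n=2):
    """Return the top_n target authors, highest score first (ties by name)."""
    scores = topic_layer_scores(source_theta, target_thetas, topic_link)
    return sorted(scores, key=lambda n: (-scores[n], n))[:top_n]


# Toy data: two source-domain topics and two target-domain topics.
# The all-zero second row mimics topic skewness: that source topic is
# treated as irrelevant to cross-domain collaboration.
topic_link = [
    [0.9, 0.1],
    [0.0, 0.0],
]
source_author = [0.7, 0.3]
target_authors = {
    "A": [0.8, 0.2],
    "B": [0.1, 0.9],
    "C": [0.5, 0.5],
}

if __name__ == "__main__":
    print(recommend(source_author, target_authors, topic_link))
```

Because the score is computed through topics, two authors can be matched even when they share no co-authors, which is the intuition behind handling sparse connections; keeping separate topic sets per domain reflects complementary expertise.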
