CollabSeer: a search engine for collaboration discovery

Collaborative research has become increasingly popular and important in academia. However, there is no open platform that lets scholars and scientists effectively discover potential collaborators. This paper discusses CollabSeer, an open system that recommends potential research collaborators to scholars and scientists. CollabSeer discovers collaborators based on the structure of the coauthor network and a user's research interests. CollabSeer currently supports three network structure analysis methods based on vertex similarity: Jaccard similarity, cosine similarity, and our relation strength similarity measure. Users can also request a recommendation by selecting a topic of interest. The list of topics is determined by CollabSeer's lexical analysis module, which analyzes the key phrases of a user's previous publications. The CollabSeer system is highly modular, making it easy to add or replace the network analysis module or the topic-of-interest analysis module. CollabSeer integrates the results of the two modules to recommend collaborators to users. Initial experimental results over a subset of the CiteSeerX database show that CollabSeer can efficiently discover prospective collaborators.
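
To make the vertex similarity idea concrete, the two standard measures named in the abstract can be sketched on a toy coauthor network. This is a minimal illustration under stated assumptions, not CollabSeer's implementation: the function names, the toy data, and the choice to treat an author's neighborhood as the set of direct coauthors (excluding the author) are all hypothetical. The relation strength similarity measure and the integration with the topic-of-interest module are not shown, because the abstract does not define them.

```python
from math import sqrt


def build_coauthor_graph(papers):
    """Map each author to the set of their direct coauthors.

    `papers` is a list of author lists (hypothetical input format).
    """
    graph = {}
    for authors in papers:
        for a in authors:
            graph.setdefault(a, set()).update(x for x in authors if x != a)
    return graph


def jaccard(graph, u, v):
    """Jaccard similarity: |N(u) & N(v)| / |N(u) | N(v)|."""
    nu, nv = graph.get(u, set()), graph.get(v, set())
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def cosine(graph, u, v):
    """Cosine (Salton) similarity: |N(u) & N(v)| / sqrt(|N(u)| * |N(v)|)."""
    nu, nv = graph.get(u, set()), graph.get(v, set())
    if not nu or not nv:
        return 0.0
    return len(nu & nv) / sqrt(len(nu) * len(nv))


def recommend(graph, user, similarity, k=3):
    """Rank authors who are not yet coauthors of `user` by similarity.

    Returns up to k (author, score) pairs with a positive score.
    """
    known = graph.get(user, set())
    candidates = [a for a in graph if a != user and a not in known]
    scored = [(similarity(graph, user, a), a) for a in candidates]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
    return [(a, s) for s, a in scored[:k] if s > 0]


# Toy data: each inner list is the author list of one paper.
papers = [
    ["alice", "bob"],
    ["bob", "carol"],
    ["alice", "dave"],
    ["dave", "carol"],
    ["carol", "erin"],
]
graph = build_coauthor_graph(papers)
print(recommend(graph, "alice", jaccard))
print(recommend(graph, "alice", cosine))
```

In this toy network, alice and carol have never coauthored but share two coauthors (bob and dave), so carol is the only candidate with a nonzero score under either measure. Passing the similarity function as a parameter loosely mirrors the modular design described in the abstract, where the network analysis method can be swapped out.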
