Finding Underlying Connections: A Fast Graph-Based Method for Link Analysis and Collaboration Queries

Many techniques in the social sciences and graph theory examine and analyze patterns in the underlying structure and associations of a group of entities. However, much of this work assumes that the underlying structure is known or can easily be inferred from data, an assumption that is often unrealistic for real-world problems. We consider the problem of learning and querying a graph-based model of this underlying structure. The model is learned from noisy observations linking sets of entities. We explicitly allow different types of links (representing different types of relations) and temporal information indicating when a link was observed. We quantitatively compare this representation and learning method against other algorithms on the task of predicting future links and new "friendships" in a variety of real-world data sets.
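
To make the setting concrete, the following is a minimal sketch of one way such typed, timestamped observations over sets of entities could be turned into a weighted graph and queried. It is not the method described here: the exponential time decay, the `half_life` parameter, the per-type weights, and the names `build_graph` and `link_score` are all assumptions chosen purely for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(observations, half_life=365.0, now=None):
    """Accumulate time-decayed co-occurrence weights (illustrative assumption).

    observations: iterable of (entities, link_type, timestamp), where
    entities is a set of hashable, sortable ids and timestamp is a number.
    Returns a dict mapping (a, b, link_type) -> weight, with a < b.
    """
    observations = list(observations)
    if now is None:
        # Hypothetical default: measure age relative to the latest observation.
        now = max(t for _, _, t in observations)
    weights = defaultdict(float)
    for entities, link_type, t in observations:
        # Older observations contribute less; the weight halves every half_life.
        decay = 0.5 ** ((now - t) / half_life)
        for a, b in combinations(sorted(set(entities)), 2):
            weights[(a, b, link_type)] += decay
    return dict(weights)


def link_score(graph, a, b, type_weights=None):
    """Sum the weights of all link types observed between a and b."""
    type_weights = type_weights or {}
    a, b = sorted((a, b))
    return sum(
        w * type_weights.get(link_type, 1.0)
        for (u, v, link_type), w in graph.items()
        if (u, v) == (a, b)
    )


# Toy data: one three-person co-authorship at time 0, one email at time 10.
observations = [
    ({"ann", "bob", "cy"}, "coauthor", 0),
    ({"ann", "bob"}, "email", 10),
]
graph = build_graph(observations, half_life=10.0, now=10)
```

Under these assumptions, `link_score(graph, "ann", "bob")` combines the decayed co-authorship evidence with the recent email, while a pair that was never observed together scores zero. A link-prediction baseline could rank unobserved pairs by a score of this kind.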
