Combining Collective Classification and Link Prediction

The problems of object classification (labeling the nodes of a graph) and link prediction (predicting the links in a graph) have been studied largely independently. Commonly, object classification is performed assuming a complete set of known links, and link prediction is done assuming a fully observed set of node attributes. In most real-world domains, however, attributes and links are often missing or incorrect: object classification is not provided with all the links relevant to correct classification, and link prediction is not provided with all the labels needed for accurate prediction. In this paper, we propose an approach that addresses both problems by interleaving object classification and link prediction in a collective algorithm. We investigate empirically the conditions under which an integrated approach to object classification and link prediction improves performance, and find that performance improves over a wide range of network types and algorithm settings.
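
To make the idea of interleaving concrete, the following is a minimal sketch and not the algorithm evaluated in the paper. It assumes a toy majority-vote classifier and a link scorer based on common neighbors plus label agreement. All function names, the threshold, and the example graph are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def neighbors(edges, node):
    """Return the set of nodes adjacent to `node` in an undirected edge set."""
    return {v for u, v in edges if u == node} | {u for u, v in edges if v == node}


def classify(nodes, edges, labels, seed_labels):
    """Relabel each non-seed node by majority vote over its labeled neighbors."""
    new_labels = dict(seed_labels)
    for n in nodes:
        if n in seed_labels:
            continue
        votes = Counter(labels[m] for m in neighbors(edges, n) if m in labels)
        if votes:
            # Sorting the candidates makes tie-breaking deterministic.
            new_labels[n] = max(sorted(votes), key=votes.get)
        elif n in labels:
            new_labels[n] = labels[n]
    return new_labels


def predict_links(nodes, observed, labels, threshold=2):
    """Predict an edge when common neighbors plus label agreement reach `threshold`."""
    predicted = set()
    for u, v in combinations(sorted(nodes), 2):
        if (u, v) in observed or (v, u) in observed:
            continue
        score = len(neighbors(observed, u) & neighbors(observed, v))
        if u in labels and v in labels and labels[u] == labels[v]:
            score += 1  # assumed bonus for matching predicted labels
        if score >= threshold:
            predicted.add((u, v))
    return predicted


def interleave(nodes, observed, seed_labels, rounds=5):
    """Alternate classification and link prediction until nothing changes."""
    observed = set(observed)
    labels = dict(seed_labels)
    edges = set(observed)
    for _ in range(rounds):
        # Classification sees the links predicted in the previous round.
        new_labels = classify(nodes, edges, labels, seed_labels)
        # Link prediction sees the labels just predicted.
        new_edges = observed | predict_links(nodes, observed, new_labels)
        if new_labels == labels and new_edges == edges:
            break
        labels, edges = new_labels, new_edges
    return labels, edges


# Hypothetical toy graph: two seed labels, four unlabeled nodes.
nodes = ["a", "b", "c", "d", "e", "f"]
observed = [("a", "b"), ("a", "c"), ("d", "f"), ("e", "f")]
seeds = {"a": "x", "f": "y"}
labels, edges = interleave(nodes, observed, seeds)
print(labels)
print(sorted(edges))
```

In this sketch each step consumes the other's latest output: predicted labels contribute to the link scores, and predicted links enter the neighborhood used for voting in the next round. A real system would substitute learned classifiers and probabilistic inference for both toy components.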
