Collective Classification in Network Data

Many real-world applications produce networked data, such as the World Wide Web (hypertext documents connected via hyperlinks), social networks (for example, people connected by friendship links), communication networks (computers connected via communication links), and biological networks (for example, protein interaction networks). A recent focus in machine learning research has been to extend traditional classification techniques to classify nodes in such networks. In this article, we provide a brief introduction to this area of research and describe how it has progressed during the past decade. We introduce four of the most widely used inference algorithms for classifying networked data and empirically compare them on both synthetic and real-world data.
