A Brief Survey of Machine Learning Methods for Classification in Networked Data and an Application to Suspicion Scoring

This paper surveys machine learning work on the problem of within-network learning and inference. To motivate the rest of the survey and give it context, we start by presenting some published applications of within-network inference. After a brief formulation of the problem and a discussion of probabilistic inference in arbitrary networks, we survey machine learning work applied to networked data, along with some important predecessors, mostly from the statistics and pattern recognition literature. We then describe an application of within-network inference to suspicion scoring in social networks. We close with pointers to toolkits and benchmark data sets used in machine learning research on classification in networked data. We hope that this survey will be a useful resource to workshop participants, and perhaps will be complemented by others.
