Graphs in machine learning. An introduction

Graphs are commonly used to characterise interactions between objects of interest. Because they are based on a straightforward formalism, they are used in many scientific fields, from computer science to historical sciences. In this paper, we give an introduction to some methods relying on graphs for learning. This includes both unsupervised and supervised methods. Unsupervised learning algorithms usually aim at visualising graphs in latent spaces and/or clustering the nodes. Both focus on extracting knowledge from graph topologies. While most existing techniques are only applicable to static graphs, where edges do not evolve through time, recent developments have shown that they could be extended to deal with evolving networks. In a supervised context, one generally aims at inferring labels or numerical values attached to nodes using both the graph and, when they are available, node characteristics. Balancing the two sources of information can be challenging, especially as they can disagree locally or globally. In both contexts, supervised and unsupervised, data can be relational (augmented with one or several global graphs), as described above, or graph valued. In the latter case, each object of interest is given as a full graph (possibly completed by other characteristics). In this context, natural tasks include graph clustering (that is, producing clusters of graphs rather than clusters of nodes in a single graph), graph classification, etc.

1 Real networks

One of the first practical studies on graphs dates back to the original work of Moreno [51] in the 1930s. Since then, there has been a growing interest in graph analysis, associated with strong developments in the modelling and the processing of these data. Graphs are now used in many scientific fields.
In biology [54, 2, 7], for instance, metabolic networks can describe pathways of biochemical reactions [41], while in social sciences networks are used to represent relational ties between actors [66, 56, 36, 34]. Other examples include power grids [71] and the web [75]. Recently, networks have also been considered in other areas such as geography [22] and history [59, 39]. In machine learning, networks are seen as powerful tools for modelling problems, both to extract information from data and for prediction purposes. This is the object of this paper. For more complete surveys, we refer to [28, 62, 49, 45]. In this section, we introduce notations and highlight properties shared by most real networks. In Section 2, we then consider methods aiming at extracting information from a single network. We will particularly focus on clustering methods where the goal is to find clusters of vertices. Finally, in Section 3, techniques that take a series of networks into account, where each network is

[1]  Thomas L. Griffiths,et al.  Learning Systems of Concepts with an Infinite Relational Model , 2006, AAAI.

[2]  P. Latouche,et al.  Model selection and clustering in stochastic block models based on the exact integrated complete data likelihood , 2015 .

[3]  Charles Bouveyron,et al.  The random subgraph model for the analysis of an ecclesiastical network in Merovingian Gaul , 2012, 1212.5497.

[4]  Nello Cristianini,et al.  Kernel Methods for Pattern Analysis , 2004 .

[5]  T. Snijders,et al.  Estimation and Prediction for Stochastic Blockmodels for Graphs with Latent Block Structure , 1997 .

[6]  J. Moreno Who Shall Survive: A New Approach to the Problem of Human Interrelations , 2017 .

[7]  Christophe Ambroise,et al.  Fast online graph clustering via Erdös-Rényi mixture , 2008, Pattern Recognit..

[8]  Fabrice Rossi,et al.  Exploration of a Large Database of French Notarial Acts with Social Network Methods , 2011 .

[9]  Santo Fortunato,et al.  Community detection in graphs , 2009, ArXiv.

[10]  Peter D. Hoff,et al.  Latent Space Approaches to Social Network Analysis , 2002 .

[11]  S. V. N. Vishwanathan,et al.  Graph kernels , 2007 .

[12]  Jerry Ray Dias,et al.  Chemical Applications of Graph Theory , 1992 .

[13]  Bernhard Schölkopf,et al.  Uncovering the structure and temporal dynamics of information propagation , 2014, Network Science.

[14]  Barbara Hammer,et al.  Neural methods for non-standard data , 2004, ESANN.

[15]  P. Latouche,et al.  Overlapping stochastic block models with application to the French political blogosphere , 2009, 0910.2098.

[16]  Thomas Brendan Murphy,et al.  Variational Bayesian inference for the Latent Position Cluster Model , 2009, NIPS 2009.

[17]  Yihong Gong,et al.  Detecting communities and their evolutions in dynamic social networks—a Bayesian approach , 2011, Machine Learning.

[18]  Zoubin Ghahramani,et al.  Dynamic Probabilistic Models for Latent Feature Propagation in Social Networks , 2013, ICML.

[19]  Alfred O. Hero,et al.  Dynamic Stochastic Blockmodels: Statistical Models for Time-Evolving Networks , 2013, SBP.

[20]  Stéphane Robin,et al.  Uncovering latent structure in valued graphs: A variational approach , 2010, 1011.1813.

[21]  S. N. Dorogovtsev,et al.  Structure of growing networks with preferential linking. , 2000, Physical review letters.

[22]  Pierre Baldi,et al.  Graph kernels for chemical informatics , 2005, Neural Networks.

[23]  Ludovic Denoyer,et al.  Classification and annotation in social corpora using multiple relations , 2011, CIKM '11.

[24]  M E J Newman,et al.  Fast algorithm for detecting community structure in networks. , 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[25]  Alessio Micheli,et al.  A general framework for unsupervised processing of structured data , 2004, Neurocomputing.

[26]  Mark E. J. Newman,et al.  Power-Law Distributions in Empirical Data , 2007, SIAM Rev..

[27]  Scott Fortin The Graph Isomorphism Problem , 1996 .

[28]  Marie Cottrell,et al.  Neural Networks for Complex Data , 2012, KI - Künstliche Intelligenz.

[29]  Caroline Haythornthwaite,et al.  Studying Online Social Networks , 2006, J. Comput. Mediat. Commun..

[30]  Vladimir I. Levenshtein,et al.  Binary codes capable of correcting deletions, insertions, and reversals , 1965 .

[31]  M E J Newman,et al.  Finding and evaluating community structure in networks. , 2003, Physical review. E, Statistical, nonlinear, and soft matter physics.

[32]  Albert-László Barabási,et al.  Statistical mechanics of complex networks , 2001, ArXiv.

[33]  Andreas Fink,et al.  Advances in Data Analysis, Data Handling and Business Intelligence: Proceedings of the 32nd Annual Conference of the Gesellschaft für Klassifikation e.V., ... Data Analysis, and Knowledge Organization) , 2009 .

[34]  Berthold Lausen,et al.  Advances in Data Analysis, Data Handling and Business Intelligence - Proceedings of the 32nd Annual Conference of the Gesellschaft für Klassifikation e.V., Joint Conference with the British Classification Society (BCS) and the Dutch/Flemish Classification Society (VOC), Helmut-Schmidt-University, Ha , 2010, GfKl.

[35]  César Ducruet,et al.  Network diversity and maritime flows , 2013 .

[36]  Fabrice Rossi,et al.  A Triclustering Approach for Time Evolving Graphs , 2012, 2012 IEEE 12th International Conference on Data Mining Workshops.

[37]  Neil J. Hurley,et al.  Computational Statistics and Data Analysis , 2022 .

[38]  Eugene M. Luks,et al.  Isomorphism of graphs of bounded valence can be tested in polynomial time , 1980, 21st Annual Symposium on Foundations of Computer Science (sfcs 1980).

[39]  Franck Picard,et al.  A mixture model for random graphs , 2008, Stat. Comput..

[40]  Christophe Ambroise,et al.  Clustering based on random graph model embedding vertex features , 2009, Pattern Recognit. Lett..

[41]  Mikhail Belkin,et al.  Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples , 2006, J. Mach. Learn. Res..

[42]  Michaël Aupetit,et al.  High-dimensional labeled data analysis with topology representing graphs , 2005, Neurocomputing.

[43]  Cristina G. Fernandes,et al.  Motif Search in Graphs: Application to Metabolic Networks , 2006, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[44]  Ulrike von Luxburg,et al.  A tutorial on spectral clustering , 2007, Stat. Comput..

[45]  Fabrice Rossi,et al.  Dissemination of Health Information within Social Networks , 2012, ArXiv.

[46]  Horst Bunke,et al.  Error Correcting Graph Matching: On the Influence of the Underlying Cost Function , 1999, IEEE Trans. Pattern Anal. Mach. Intell..

[47]  Mark E. J. Newman,et al.  The Structure and Function of Complex Networks , 2003, SIAM Rev..

[48]  N. Aronszajn Theory of Reproducing Kernels. , 1950 .

[49]  Xuelong Li,et al.  A survey of graph edit distance , 2010, Pattern Analysis and Applications.

[50]  A. Barabasi,et al.  Network biology: understanding the cell's functional organization , 2004, Nature Reviews Genetics.

[51]  Padhraic Smyth,et al.  Stochastic blockmodeling of relational event dynamics , 2013, AISTATS.

[52]  P. Latouche,et al.  Model selection and clustering in stochastic block models with the exact integrated complete data likelihood , 2013, 1303.2962.

[53]  Edoardo M. Airoldi,et al.  Mixed Membership Stochastic Blockmodels , 2007, NIPS.

[54]  Ulrik Brandes,et al.  On Modularity Clustering , 2008, IEEE Transactions on Knowledge and Data Engineering.

[55]  Charu C. Aggarwal,et al.  Graph Clustering , 2010, Encyclopedia of Machine Learning and Data Mining.

[56]  Andreas Noack,et al.  Multi-level Algorithms for Modularity Clustering , 2008, SEA.

[57]  Chris H Wiggins,et al.  Bayesian approach to network modularity. , 2007, Physical review letters.

[58]  Duncan J. Watts,et al.  Collective dynamics of ‘small-world’ networks , 1998, Nature.

[59]  Christophe Ambroise,et al.  Variational Bayesian inference and complexity control for stochastic block models , 2009, 0912.2873.

[60]  Albert-László Barabási,et al.  Emergence of scaling in random networks , 1999, Science.

[61]  Edoardo M. Airoldi,et al.  A Survey of Statistical Network Models , 2009, Found. Trends Mach. Learn..

[62]  P. Bickel,et al.  A nonparametric view of network models and Newman–Girvan and other modularities , 2009, Proceedings of the National Academy of Sciences.

[63]  A. Raftery,et al.  Model‐based clustering for social networks , 2007 .

[64]  H. Bunke Graph Matching: Theoretical Foundations, Algorithms, and Applications , 2022 .

[65]  A. Balaban Chemical applications of graph theory , 1976 .

[66]  Hans-Peter Kriegel,et al.  Protein function prediction via graph kernels , 2005, ISMB.

[67]  E. Xing,et al.  A state-space mixed membership blockmodel for dynamic network tomography , 2008, 0901.0135.

[68]  Catherine Matias,et al.  Modeling Heterogeneity in Random Graphs through Latent Space Models: A Selective Review , 2014 .

[69]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[70]  Alexander J. Smola,et al.  Learning with kernels , 1998 .

[71]  Thomas Brendan Murphy,et al.  Review of statistical network analysis: models, algorithms, and software , 2012, Stat. Anal. Data Min..

[72]  M E J Newman,et al.  Community structure in social and biological networks , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[73]  Ah Chung Tsoi,et al.  Self-Organizing Maps for cyclic and unbounded graphs , 2008, ESANN.

[74]  S. Strogatz Exploring complex networks , 2001, Nature.

[75]  H E Stanley,et al.  Classes of small-world networks. , 2000, Proceedings of the National Academy of Sciences of the United States of America.

[76]  Jean-Loup Guillaume,et al.  Fast unfolding of communities in large networks , 2008, 0803.0476.

[77]  B. Hammer,et al.  Topographic Processing of Relational Data , 2007 .

[78]  T. Snijders,et al.  Estimation and Prediction for Stochastic Blockstructures , 2001 .