Batch kernel SOM and related Laplacian methods for social network analysis

Large graphs are natural mathematical models for describing the structure of data in a wide variety of fields, such as web mining, social networks, information retrieval, and biological networks. All these applications require automatic tools that give a synthetic view of the graph and support a good understanding of the underlying problem. In particular, discovering groups of tightly connected vertices and understanding the relations between those groups is very important in practice. This paper shows how a kernel version of the batch self-organizing map can achieve these goals by using kernels derived from the Laplacian matrix of the graph, especially when it is combined with more classical methods based on the spectral analysis of the graph. The proposed method is used to explore the structure of a medieval social network modelled as a weighted graph built directly from a large corpus of agrarian contracts.
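As a rough illustration of the idea, the sketch below builds one possible Laplacian-derived kernel and runs a batch kernel SOM on it. It is not the paper's implementation. The abstract does not say which kernel is used, so the heat (diffusion) kernel K = exp(-beta L) is an assumption here, as are the one-dimensional grid of units, the fixed Gaussian neighbourhood width `sigma`, the random initialisation, and the toy graph. The function names `heat_kernel` and `batch_kernel_som` are hypothetical.

```python
import numpy as np

def heat_kernel(W, beta=1.0):
    """Heat kernel K = exp(-beta * L), where L = D - W is the graph Laplacian.

    W is the symmetric weight matrix of an undirected graph (assumed input).
    """
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W
    eigvals, eigvecs = np.linalg.eigh(L)  # L is symmetric, so eigh applies
    return (eigvecs * np.exp(-beta * eigvals)) @ eigvecs.T

def batch_kernel_som(K, n_units=2, n_iter=20, sigma=0.5, seed=0):
    """Batch kernel SOM on a 1-D grid (illustrative choice of topology).

    Each prototype j is a convex combination of the mapped vertices,
    p_j = sum_i gamma[j, i] * phi(x_i), so only the kernel matrix K is needed.
    """
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    grid = np.arange(n_units, dtype=float)
    # Gaussian neighbourhood between units on the grid
    H = np.exp(-(grid[:, None] - grid[None, :]) ** 2 / (2 * sigma ** 2))
    # Random convex coefficients as initial prototypes
    gamma = rng.random((n_units, n))
    gamma /= gamma.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Squared feature-space distance ||phi(x_i) - p_j||^2
        cross = gamma @ K                                  # (n_units, n)
        proto_norm = np.einsum("ji,ji->j", cross, gamma)   # gamma K gamma^T, diagonal
        dist = np.diag(K)[None, :] - 2 * cross + proto_norm[:, None]
        # Assignment step: each vertex goes to its closest prototype
        assign = dist.argmin(axis=0)
        # Representation step: neighbourhood-weighted mean of mapped vertices
        gamma = H[:, assign]
        gamma /= gamma.sum(axis=1, keepdims=True)
    return assign

# Toy graph (made up for illustration): two triangles joined by the edge 2-3
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0

K = heat_kernel(W, beta=1.0)
clusters = batch_kernel_som(K, n_units=2)
```

Because the prototypes are expressed through the coefficients `gamma`, the algorithm never needs explicit vertex coordinates. Swapping in another positive semi-definite kernel derived from the Laplacian only changes `heat_kernel`.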
