Distance metric learning for complex networks: towards size-independent comparison of network structures.

Real networks exhibit nontrivial topological properties such as community structure and long-tailed degree distributions. Many network analysis applications rely on comparing the topology of complex networks; classification and clustering of networks, model selection, and anomaly detection are a few examples. These applications need an effective similarity metric that, given two complex networks of possibly different sizes, quantifies how similar their structural features are. Traditional graph comparison approaches, such as isomorphism-based methods, are not only too time-consuming but also unsuitable for comparing networks of different sizes. In this paper, we propose an intelligent method based on genetic algorithms for integrating, selecting, and weighting network features in order to develop an effective similarity measure for complex networks. The proposed similarity metric outperforms state-of-the-art methods with respect to several evaluation criteria.
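To make the idea of weighting network features with a genetic algorithm concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes each network has already been summarized as a vector of size-independent features (for example, clustering coefficient or assortativity) and that a class label is known for each training network. The fitness function (mean between-class distance minus mean within-class distance), the operators (tournament selection, uniform crossover, Gaussian mutation, elitism), and every name and parameter value are illustrative assumptions.

```python
import random


def normalize(weights):
    """Scale non-negative weights to sum to 1 (uniform if all are zero)."""
    total = sum(weights)
    if total <= 0:
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in weights]


def weighted_distance(x, y, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)) ** 0.5


def fitness(weights, samples, labels):
    """Assumed objective: mean between-class minus mean within-class distance."""
    intra, inter = [], []
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            d = weighted_distance(samples[i], samples[j], weights)
            (intra if labels[i] == labels[j] else inter).append(d)
    return sum(inter) / len(inter) - sum(intra) / len(intra)


def learn_weights(samples, labels, generations=40, pop_size=30, seed=0):
    """Evolve a normalized weight vector with a simple genetic algorithm."""
    rng = random.Random(seed)
    n_features = len(samples[0])

    def score(w):
        return fitness(w, samples, labels)

    population = [
        normalize([rng.random() for _ in range(n_features)])
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        next_gen = ranked[:2]  # elitism: carry the two best forward unchanged
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random individuals.
            p1 = max(rng.sample(ranked, 3), key=score)
            p2 = max(rng.sample(ranked, 3), key=score)
            # Uniform crossover, then Gaussian mutation clipped at zero.
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [
                max(0.0, w + rng.gauss(0, 0.1)) if rng.random() < 0.2 else w
                for w in child
            ]
            next_gen.append(normalize(child))
        population = next_gen
    return max(population, key=score)
```

A weight that converges to zero effectively deselects that feature, so in this sketch feature selection and feature weighting are handled by the same search. Because the distance is computed on feature vectors rather than on the graphs themselves, networks of different sizes can be compared as long as the chosen features do not scale with network size.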
