Triangle minimization in large networks

The number of triangles is a fundamental metric for analyzing the structure and function of a network. In this paper, for the first time, we investigate the triangle minimization problem in a network under edge (node) attack, where the attacker aims to minimize the number of triangles in the network by removing $k$ edges (nodes). We show that the triangle minimization problem under edge (node) attack is a submodular function maximization problem, which can be solved efficiently. Specifically, we propose a degree-based edge (node) removal algorithm and a near-optimal greedy edge (node) removal algorithm for approximately solving the triangle minimization problem under edge (node) attack. In addition, we introduce two pruning strategies and an approximate marginal gain evaluation technique to further speed up the greedy edge (node) removal algorithm. We conduct extensive experiments over 12 real-world datasets to evaluate the proposed algorithms, and the results demonstrate the effectiveness, efficiency and scalability of our algorithms.
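To make the greedy edge removal idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes an undirected simple graph given as a list of edges, and it takes the marginal gain of removing an edge (u, v) to be the number of triangles containing that edge, that is, the number of common neighbors of u and v in the current graph. The function names (`build_adjacency`, `count_triangles`, `greedy_edge_removal`) are hypothetical, and the sketch includes neither the pruning strategies nor the approximate marginal gain evaluation mentioned above.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_triangles(adj):
    """Count triangles; each one is seen once per edge, hence the division by 3."""
    per_edge = sum(len(adj[u] & adj[v]) for u in adj for v in adj[u] if u < v)
    return per_edge // 3


def greedy_edge_removal(edges, k):
    """Remove up to k edges, each time picking the edge in the most triangles.

    Assumption: node labels are mutually comparable (e.g. all ints), so that
    `u < v` can be used to visit each undirected edge once.
    Returns (removed_edges, remaining_triangle_count).
    """
    adj = build_adjacency(edges)
    removed = []
    for _ in range(k):
        best_edge, best_gain = None, 0
        for u in adj:
            for v in adj[u]:
                if u < v:
                    # Marginal gain: triangles destroyed by deleting (u, v).
                    gain = len(adj[u] & adj[v])
                    if gain > best_gain:
                        best_edge, best_gain = (u, v), gain
        if best_edge is None:
            break  # no remaining edge lies in any triangle
        u, v = best_edge
        adj[u].discard(v)
        adj[v].discard(u)
        removed.append(best_edge)
    return removed, count_triangles(adj)
```

This plain version rescans every edge in each of the k rounds, which is the cost that pruning and approximate gain evaluation would aim to reduce on large graphs.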
