Exact MIP-based approaches for finding maximum quasi-cliques and dense subgraphs

Given a simple graph and a constant $\gamma \in (0,1]$, a $\gamma$-quasi-clique is defined as a subset of vertices that induces a subgraph with an edge density of at least $\gamma$. This well-known clique relaxation model arises in a variety of application domains. The maximum $\gamma$-quasi-clique problem is to find a $\gamma$-quasi-clique of maximum cardinality in the graph and is known to be NP-hard. This paper proposes new mixed integer programming (MIP) formulations for solving the maximum $\gamma$-quasi-clique problem. The corresponding linear programming (LP) relaxations are analyzed and shown, on sparse graphs, to be tighter than the LP relaxations of the MIP models available in the literature. The developed methodology generalizes naturally to the maximum $f(\cdot)$-dense subgraph problem, which, for a given function $f(\cdot)$, seeks the largest $k$ such that there is a subgraph induced by $k$ vertices with at least $f(k)$ edges. The performance of the proposed exact approaches is illustrated on real-life network instances with up to 10,000 vertices.
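To make the definition concrete, the following is a minimal Python sketch of the $\gamma$-quasi-clique test and a naive exhaustive search for a maximum one. It is not the MIP approach proposed in the paper; it only illustrates the definition on tiny graphs. The function names, the edge-list representation, and the convention that a subset with fewer than two vertices has density 1 are assumptions made for this illustration.

```python
from itertools import combinations


def edge_density(edges, subset):
    """Edge density of the subgraph induced by `subset`.

    Assumption for this sketch: a subset with fewer than two vertices
    is given density 1.0, so single vertices count as quasi-cliques.
    """
    s = set(subset)
    k = len(s)
    if k < 2:
        return 1.0
    # Count edges with both endpoints inside the subset.
    induced = sum(1 for u, v in edges if u in s and v in s)
    return induced / (k * (k - 1) / 2)


def is_quasi_clique(edges, subset, gamma):
    """True if `subset` induces a subgraph with edge density >= gamma."""
    return edge_density(edges, subset) >= gamma


def max_quasi_clique_bruteforce(vertices, edges, gamma):
    """Exhaustive search, largest subsets first (exponential time).

    Only meant for very small graphs; it returns the first subset of
    maximum cardinality that passes the density test.
    """
    vertices = list(vertices)
    for k in range(len(vertices), 0, -1):
        for subset in combinations(vertices, k):
            if is_quasi_clique(edges, subset, gamma):
                return set(subset)
    return set()


# Hypothetical toy graph: a 4-cycle with one chord, plus a pendant vertex 5.
toy_vertices = [1, 2, 3, 4, 5]
toy_edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3), (1, 5)]
best = max_quasi_clique_bruteforce(toy_vertices, toy_edges, 0.8)
```

In this toy graph the whole vertex set has 6 of 10 possible edges, so it fails the test for $\gamma = 0.8$, while the four vertices of the chorded cycle have 5 of 6. Setting $\gamma = 1$ reduces the test to the ordinary clique condition, and replacing the density check with "at least $f(k)$ induced edges" gives the $f(\cdot)$-dense variant described above.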
