Lovász ϑ function, SVMs and finding dense subgraphs

In this paper we establish that the Lovász ϑ function on a graph can be restated as a kernel learning problem. We introduce the notion of SVM-ϑ graphs, on which the Lovász ϑ function can be approximated well by a support vector machine (SVM). We show that Erdős–Rényi random G(n, p) graphs are SVM-ϑ graphs for log⁴(n)/n ≤ p < 1. Even if we embed a large clique of size Θ(√(np/(1−p))) in a G(n, p) graph, the resulting graph remains an SVM-ϑ graph. This immediately suggests an SVM-based algorithm for recovering a large planted clique in random graphs. Associated with the ϑ function is the notion of orthogonal labellings. We introduce common orthogonal labellings, which extend the idea of orthogonal labellings to multiple graphs. This allows us to propose a multiple kernel learning (MKL) based solution that is capable of identifying a large common dense subgraph in multiple graphs. In both the planted clique and the common subgraph detection problems, the proposed solutions beat the state of the art by an order of magnitude.
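
The abstract does not spell out the kernel formulation, so the following is only a minimal sketch under stated assumptions. It assumes a one-class-SVM-style objective, ω(K) = max over α ≥ 0 of 2·Σα_i − αᵀKα, evaluated on the kernel K = I + A/ρ, where A is the adjacency matrix and ρ ≥ −λ_min(A) keeps K positive semidefinite. Under those assumptions ω(K) upper-bounds ϑ(G). The function name `svm_theta_bound`, the projected-gradient solver, and the default choice ρ = maximum degree are illustrative and not taken from the paper.

```python
import math

def svm_theta_bound(adj, rho=None, iters=2000):
    """Maximise 2*sum(a) - a^T K a over a >= 0 by projected gradient ascent.

    K = I + A/rho is an assumed kernel; rho must satisfy rho >= -lambda_min(A)
    so that K is positive semidefinite. Returns (objective value, a).
    """
    n = len(adj)
    max_deg = max(sum(row) for row in adj)
    if rho is None:
        # Gershgorin: lambda_min(A) >= -max_deg, so rho = max_deg is always safe.
        rho = max_deg if max_deg > 0 else 1.0
    K = [[(1.0 if i == j else 0.0) + adj[i][j] / rho for j in range(n)]
         for i in range(n)]
    # lambda_max(K) <= 1 + max_deg/rho, so this step is at most 1/L for the gradient.
    step = 1.0 / (2.0 * (1.0 + max_deg / rho))
    a = [0.0] * n
    for _ in range(iters):
        Ka = [sum(K[i][j] * a[j] for j in range(n)) for i in range(n)]
        # Gradient of the objective is 2 - 2*K a; clip at zero to stay feasible.
        a = [max(0.0, a[i] + step * (2.0 - 2.0 * Ka[i])) for i in range(n)]
    Ka = [sum(K[i][j] * a[j] for j in range(n)) for i in range(n)]
    value = 2.0 * sum(a) - sum(a[i] * Ka[i] for i in range(n))
    return value, a

# Usage on the 5-cycle C5, whose smallest adjacency eigenvalue is -(1 + sqrt(5))/2.
n = 5
c5 = [[1 if abs(i - j) in (1, n - 1) else 0 for j in range(n)] for i in range(n)]
loose, _ = svm_theta_bound(c5)                              # rho = max degree
tight, _ = svm_theta_bound(c5, rho=(1 + math.sqrt(5)) / 2)  # rho = -lambda_min(A)
```

On the 5-cycle, the tighter choice of ρ reproduces √5, the known value of ϑ(C5), while the looser default gives a larger value that is still a valid upper bound. How closely such a simple kernel tracks ϑ on other graphs is exactly what the SVM-ϑ property is meant to capture.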
