How to refute a random CSP

17 May 2015

Let P be a nontrivial k-ary predicate over a finite alphabet. Consider a random CSP(P) instance I over n variables with m constraints, each being P applied to k random literals. When m ≫ n the instance I will be unsatisfiable with high probability, and the natural associated algorithmic task is to find a refutation of I, i.e., a certificate of unsatisfiability. When P is the 3-ary Boolean OR predicate, this is the well-studied problem of refuting random 3-SAT formulas; in this case, an efficient algorithm is known only when m ≫ n^{3/2}. Understanding the density required for average-case refutation of other predicates is of importance for various areas of complexity, including cryptography, proof complexity, and learning theory. The main previously known result is that for a general Boolean k-ary predicate P, having m ≫ n^{⌈k/2⌉} random constraints suffices for efficient refutation. In this work we give a general criterion for arbitrary k-ary predicates, one that often yields efficient refutation algorithms at much lower densities. Specifically, if P fails to support a t-wise independent (uniform) probability distribution (2 ≤ t ≤ k), then there is an efficient algorithm that refutes random CSP(P) instances I with high probability, provided m ≫ n^{t/2}. Indeed, our algorithm will "somewhat strongly" refute I, certifying Opt(I) ≤ 1 − Ω_k(1); if t = k then we furthermore get the strongest possible refutation, certifying Opt(I) ≤ E[P] + o(1). This last result is new even in the context of random k-SAT. Regarding the optimality of our m ≫ n^{t/2} density requirement, prior work on SDP hierarchies has given some evidence that efficient refutation of random CSP(P) may be impossible when m ≪ n^{t/2}. Thus there is an indication that our algorithm's dependence on m is optimal for every P, at least in the context of SDP hierarchies. Along these lines, we show that our refutation algorithm can be carried out by the O(1)-round SOS SDP hierarchy.
Finally, as an application of our result, we falsify the "SRCSP assumptions" used to show various hardness-of-learning results in the recent (STOC 2014) work of Daniely, Linial, and Shalev-Shwartz.

Department of Computer Science, Carnegie Mellon. {srallen,odonnell,dwitmer}@cs.cmu.edu. Supported by NSF grants CCF-0747250 and CCF-1116594. Some of this work was performed while the second-named author was at the Boğaziçi University Computer Engineering Department, supported by Marie Curie International Incoming Fellowship project number 626373. The first- and third-named authors were partially supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-1252522.

1 On refutation of random CSPs

Constraint satisfaction problems (CSPs) play a major role in computer science. There is a vast theory [BJK05] of how algebraic properties of a CSP predicate affect its worst-case satisfiability complexity; there is a similarly vast theory [Rag09] of worst-case approximability of CSPs. Finally, there is a rich range of research (from the fields of computer science, mathematics, and physics) on the average-case complexity of random CSPs; see [Ach09] for a survey just of random k-SAT. This paper is concerned with random CSPs, and in particular the problem of efficiently refuting satisfiability for random instances. This is a well-studied algorithmic task with connections to, e.g., proof complexity [BB02], inapproximability [Fei02], SAT-solvers [SAT], cryptography [ABW10], learning theory [DLSS14], statistical physics [CLP02], and complexity theory [BKS13].

Historically, random CSPs are probably best studied in the case of k-SAT, k ≥ 3. The model here involves choosing a CNF formula I over n variables by drawing m clauses (ORs of k literals) independently and uniformly at random. (The precise details of the random model are inessential; see Section 3.1 for more information.)
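The random k-SAT model just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `random_ksat` and `fraction_satisfied` are hypothetical, and we assume the variant in which the k variables within a clause are distinct (the text notes that such details are inessential). The second function computes the fraction of clauses satisfied by a given assignment, the quantity whose maximum over assignments is Opt(I).

```python
import random

def random_ksat(n, m, k, seed=None):
    """Draw m clauses independently and uniformly at random.

    A literal is a nonzero int: +i means x_i, -i means NOT x_i.
    Assumption: the k variables inside one clause are distinct.
    """
    rng = random.Random(seed)
    clauses = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)
        # Negate each chosen variable independently with probability 1/2.
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        clauses.append(clause)
    return clauses

def fraction_satisfied(clauses, assignment):
    """Fraction of clauses satisfied; assignment maps variable index -> bool."""
    satisfied = sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )
    return satisfied / len(clauses)

# Example: a random 3-SAT instance with n = 50 variables and m = 400 clauses.
n, m, k = 50, 400, 3
instance = random_ksat(n, m, k, seed=0)
density = m / n  # the ratio of clauses to variables
```

Fixing `seed` makes the draw reproducible; evaluating `fraction_satisfied` on a few random assignments gives a rough feel for how far a typical assignment is from satisfying every clause.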
The random k-SAT model is one of the best-known efficient ways of generating hard-seeming instances of NP-complete and coNP-complete problems. The computational hardness depends crucially on the density, α = m/n. For each k there is (conjecturally) a constant critical density α_k such that I is satisfiable with high probability when α < α_k, and I is unsatisfiable with high probability when α > α_k. (Here and throughout, "with high probability (whp)" means with probability 1 − o(1) as n → ∞.) This phenomenon occurs for all nontrivial random CSPs; in the case of k-SAT it has been rigorously proven [DSS15] for sufficiently large k.

There is a natural algorithmic task associated with the two regimes. When α < α_k one wants to find a satisfying assignment for I. When α > α_k one wants to refute I; i.e., find a certificate of unsatisfiability. Most heuristic SAT-solvers use DPLL-based algorithms; on unsatisfiable instances, they produce certificates that can be viewed as refutations within the Resolution proof system. More generally, a refutation algorithm for density α is any algorithm that: a) outputs "unsatisfiable" or "fail"; b) never incorrectly outputs "unsatisfiable"; c) outputs "fail" with low probability (i.e., probability o(1)).¹

Empirical work suggests that as α increases towards α_k, finding satisfying assignments becomes more difficult; and conversely, as α increases beyond α_k, finding certificates of unsatisfiability gradually becomes easier. A seminal paper of Chvátal and Szemerédi [CS88] showed that for any sufficiently large integer c (depending on k), a random k-SAT instance with m = cn requires Resolution refutations of size 2^{Ω(n)} (whp). On the other hand, Fu [Fu96] showed that polynomial-size Resolution refutations exist (whp) once m ≥ O(n^{k−1}); Beame et al.
[BKPS99] subsequently showed that such proofs could be found efficiently.² A breakthrough came in 2001, when Goerdt and Krivelevich [GK01] abandoned combinatorial refutations for spectral ones, showing that random k-SAT instances can be efficiently refuted when m ≥ Õ(n^{⌈k/2⌉}). Soon thereafter, Friedman and Goerdt [FG01] (see also [FGK05]) showed that for 3-SAT, efficient spectral refutations exist once m ≥ n^{3/2+ε} (for any ε > 0). These densities for k-SAT (around n^{3/2} for 3-SAT and n^{⌈k/2⌉} in general) have not been fundamentally improved upon in the last 14 years.³ (See Table 1 for a more detailed history of these results.)

¹ We caution the reader that in this paper we do not consider the related, but distinct, scenario of distinguishing planted random instances from truly random ones.

² In this paper we use the following not-fully-standard terminology: A statement of the form "If f(n) ≥ O(g(n)) then X" means that there exists a certain function h(n), with h(n) being O(g(n)), such that the statement "If f(n) ≥ h(n) then X" is true. We also use Õ(f(n)) to denote O(f(n) · polylog(f(n))), and O_k(f(n)) to denote that the hidden constant has a dependence on k (most often of the form 2^{O(k)}).

³ Actually, it is claimed in [GJ02] that one can obtain n^{k/2+ε} for odd k "along the lines of [FG01]". On one hand, this is true, as we'll see in this paper. On the other hand, no proof was provided in [GJ02], and we have not found the claim repeated in any paper subsequent to 2003.
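Conditions (a) to (c) in the definition of a refutation algorithm can be made concrete with a deliberately naive sketch. The function below (a hypothetical name, `brute_force_refute`) is not the algorithm of this paper and is not efficient: it checks all 2^n assignments. It serves only to show the required output behaviour, namely that "unsatisfiable" is returned solely when no satisfying assignment exists, and "fail" otherwise. Clauses use the signed-integer literal convention, which is an assumption of this sketch.

```python
from itertools import product

def brute_force_refute(clauses, n):
    """Exhaustive-search refuter for tiny instances (exponential time).

    Returns "unsatisfiable" only after ruling out every assignment,
    so it never incorrectly outputs "unsatisfiable" (condition b).
    """
    for bits in product([False, True], repeat=n):
        # bits[i - 1] is the truth value assigned to variable i.
        if all(
            any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        ):
            return "fail"  # a satisfying assignment exists
    return "unsatisfiable"

# All four sign patterns on two variables cannot be satisfied at once.
contradictory = [(1, 2), (1, -2), (-1, 2), (-1, -2)]
result_unsat = brute_force_refute(contradictory, 2)

# A single clause is satisfiable, so the refuter must report "fail".
result_sat = brute_force_refute([(1, 2)], 2)
```

The interest of the results surveyed above lies in achieving the same guarantee in polynomial time, which is possible only at sufficiently high clause densities.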

[1]  Amit Daniely,et al.  Complexity Theoretic Limitations on Learning DNF's , 2014, COLT.

[2]  Ankur Moitra,et al.  Tensor Prediction, Rademacher Complexity and Random 3-XOR , 2015, ArXiv.

[3]  Pravesh Kothari,et al.  Sum of Squares Lower Bounds from Pairwise Independence , 2015, STOC.

[4]  Allan Sly,et al.  Proof of the Satisfiability Conjecture for Large k , 2014, STOC.

[5]  Benny Applebaum,et al.  Cryptographic Hardness of Random Local Functions , 2013, computational complexity.

[6]  David Witmer,et al.  Goldreich's PRG: Evidence for Near-Optimal Polynomial Stretch , 2014, 2014 IEEE 29th Conference on Computational Complexity (CCC).

[7]  Ryan O'Donnell,et al.  Analysis of Boolean Functions , 2014, ArXiv.

[8]  David Steurer,et al.  Sum-of-squares proofs and the quest toward optimal algorithms , 2014, Electron. Colloquium Comput. Complex..

[9]  Nathan Linial,et al.  From average case complexity to improper learning complexity , 2013, STOC.

[10]  Nathan Linial,et al.  More data speeds up training time in learning halfspaces over sparse vectors , 2013, NIPS.

[11]  Madhur Tulsiani,et al.  LS+ Lower Bounds from Pairwise Independence , 2013, 2013 IEEE Conference on Computational Complexity.

[12]  Sangxia Huang Approximation resistance on satisfiable instances for predicates with few accepting inputs , 2013, STOC '13.

[13]  Siu On Chan,et al.  Approximation resistance from pairwise independent subgroups , 2013, STOC '13.

[14]  Guy Kindler,et al.  On the optimality of semidefinite relaxations for average-case and generalized constraint satisfaction , 2013, ITCS '13.

[15]  Yuan Zhou,et al.  Approximability and proof complexity , 2012, SODA.

[16]  Madhur Tulsiani,et al.  SDP Gaps from Pairwise Independence , 2012, Theory Comput..

[17]  Johan Håstad,et al.  On the Usefulness of Predicates , 2012, 2012 IEEE 27th Conference on Computational Complexity.

[18]  T. Tao Topics in Random Matrix Theory , 2012 .

[19]  Avi Wigderson,et al.  Public-key cryptography from different assumptions , 2010, STOC '10.

[20]  Madhur Tulsiani CSP gaps and reductions in the Lasserre hierarchy , 2009, STOC '09.

[21]  Johan Håstad,et al.  Randomly supported independence and resistance , 2009, STOC '09.

[22]  Alan M. Frieze,et al.  An efficient sparse regularity concept , 2009, SODA.

[23]  Dimitris Achlioptas,et al.  Random Satisfiability , 2009, Handbook of Satisfiability.

[24]  P. Raghavendra,et al.  Approximating NP-hard problems: efficient algorithms and their limits , 2009 .

[25]  M. Laurent Sums of Squares, Moment Matrices and Optimization Over Polynomials , 2009 .

[26]  Grant Schoenebeck,et al.  Linear Level Lasserre Lower Bounds for Certain k-CSPs , 2008, 2008 49th Annual IEEE Symposium on Foundations of Computer Science.

[27]  Elchanan Mossel,et al.  Approximation Resistant Predicates from Pairwise Independence , 2008, 2008 23rd Annual IEEE Conference on Computational Complexity.

[28]  Per Austrin Conditional Inapproximability and Limited Independence , 2008 .

[29]  Uriel Feige,et al.  Refuting Smoothed 3CNF Formulas , 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[30]  Noga Alon,et al.  Testing k-wise and almost k-wise independence , 2007, STOC '07.

[31]  Anup Rao,et al.  An Exposition of Bourgain's 2-Source Extractor , 2007, Electron. Colloquium Comput. Complex..

[32]  Uriel Feige,et al.  Witnesses for non-satisfiability of dense random 3CNF formulas , 2006, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[33]  Amin Coja-Oghlan,et al.  Strong Refutation Heuristics for Random k-SAT , 2006, Combinatorics, Probability and Computing.

[34]  Uriel Feige,et al.  Spectral techniques applied to sparse random graphs , 2005, Random Struct. Algorithms.

[35]  Joel Friedman,et al.  Recognizing More Unsatisfiable Random k-SAT Instances Efficiently , 2005, SIAM J. Comput..

[36]  Peter Jeavons,et al.  Classifying the Complexity of Constraints Using Finite Algebras , 2005, SIAM J. Comput..

[37]  Amin Coja-Oghlan,et al.  Techniques from combinatorial approximation algorithms yield efficient algorithms for random 2k-SAT , 2004, Theor. Comput. Sci..

[38]  Moses Charikar,et al.  Maximizing quadratic programs: extending Grothendieck's inequality , 2004, 45th Annual IEEE Symposium on Foundations of Computer Science.

[39]  Subhash Khot,et al.  Ruling out PTAS for graph min-bisection, densest subgraph and bipartite clique , 2004, 45th Annual IEEE Symposium on Foundations of Computer Science.

[40]  Uriel Feige,et al.  Easily Refutable Subformulas of Large Random 3CNF Formulas , 2004, ICALP.

[41]  M. Alekhnovich,et al.  More on Average Case vs Approximation Complexity , 2003, 44th Annual IEEE Symposium on Foundations of Computer Science.

[42]  Andreas Goerdt,et al.  Recognizing more random unsatisfiable 3-SAT instances efficiently , 2003, Electron. Notes Discret. Math..

[43]  Tomasz Jurdzinski,et al.  Some Results on Random Unsatisfiable k-Sat Instances and Approximation Algorithms Applied to Random Structures , 2002, Combinatorics, Probability and Computing.

[44]  Noga Alon,et al.  Almost k-wise independence versus k-wise independence , 2003, Information Processing Letters.

[45]  U. Feige Relations between average case complexity and approximation complexity , 2002, STOC '02.

[46]  Michael E. Saks,et al.  The Efficiency of Resolution and Davis--Putnam Procedures , 2002, SIAM J. Comput..

[47]  A. Crisanti,et al.  The 3-SAT problem with large number of clauses in the ∞-replica symmetry breaking scheme , 2001, cond-mat/0108433.

[48]  Yonatan Bilu,et al.  A Gap in Average Proof Complexity , 2002, Electron. Colloquium Comput. Complex..

[49]  Dima Grigoriev,et al.  Complexity of Null- and Positivstellensatz proofs , 2001, Ann. Pure Appl. Log..

[50]  Joel Friedman,et al.  Recognizing More Unsatisfiable Random 3-SAT Instances Efficiently , 2001, ICALP.

[51]  Dima Grigoriev,et al.  Linear lower bound on degrees of Positivstellensatz calculus proofs for the parity , 2001, Theor. Comput. Sci..

[52]  Michael Krivelevich,et al.  Efficient Recognition of Random Unsatisfiable k-SAT Instances by Spectral Methods , 2001, STACS.

[53]  Jean B. Lasserre,et al.  Global Optimization with Polynomials and the Problem of Moments , 2000, SIAM J. Optim..

[54]  J. Lasserre,et al.  Optimisation globale et théorie des moments , 2000 .

[55]  P. Parrilo Structured semidefinite programs and semialgebraic geometry methods in robustness and optimization , 2000 .

[56]  Oded Goldreich,et al.  Candidate One-Way Functions Based on Expander Graphs , 2011, Studies in Complexity and Cryptography.

[57]  Eli Ben-Sasson,et al.  Short proofs are narrow-resolution made simple , 1999, Fourteenth Annual IEEE Conference on Computational Complexity.

[58]  Michael E. Saks,et al.  On the complexity of unsatisfiability proofs for random k-CNF formulas , 1998, STOC '98.

[59]  Alessandro Panconesi,et al.  Approximability of Maximum Splitting of k-Sets and Some Other Apx-Complete Problems , 1996, Inf. Process. Lett..

[60]  Simon Litsyn,et al.  On Integral Zeros of Krawtchouk Polynomials , 1996, J. Comb. Theory, Ser. A.

[61]  Stephen A. Cook,et al.  On the complexity of proof systems , 1996 .

[62]  Oded Goldreich,et al.  Three XOR-Lemmas - An Exposition , 1995, Electron. Colloquium Comput. Complex..

[63]  Noga Alon,et al.  The Probabilistic Method , 2015, Fundamentals of Ramsey Theory.

[64]  Endre Szemerédi,et al.  Many hard examples for resolution , 1988, JACM.

[65]  N. Z. Shor Class of global minimum bounds of polynomial functions , 1987 .

[66]  Noga Alon,et al.  A Fast and Simple Randomized Parallel Algorithm for the Maximal Independent Set Problem , 1985, J. Algorithms.

[67]  U. Vazirani Randomness, adversaries and computation (random polynomial time) , 1986 .

[68]  Leslie G. Valiant,et al.  A theory of the learnable , 1984, CACM.

[69]  John Franco,et al.  Probabilistic analysis of the Davis Putnam procedure for solving the satisfiability problem , 1983, Discret. Appl. Math..

[70]  János Komlós,et al.  The eigenvalues of random symmetric matrices , 1981, Comb..

[71]  F. MacWilliams,et al.  The Theory of Error-Correcting Codes , 1977 .

[72]  E. Wigner Characteristic Vectors of Bordered Matrices with Infinite Dimensions I , 1955 .