Learning-Theoretic Foundations of Algorithm Configuration for Combinatorial Partitioning Problems

Max-cut, clustering, and many other partitioning problems of significant importance to machine learning and other scientific fields are NP-hard, a reality that has motivated researchers to develop a wealth of approximation algorithms and heuristics. Although the best algorithm to use typically depends on the specific application domain, worst-case analysis is often used to compare algorithms. This can be misleading if worst-case instances occur infrequently, and thus there is a demand for optimization methods that return the algorithm configuration best suited for the given application's typical inputs. We address this problem for clustering, max-cut, and other partitioning problems, such as integer quadratic programming, by designing computationally efficient and sample-efficient learning algorithms that receive samples from an application-specific distribution over problem instances and learn a partitioning algorithm with high expected performance. Our algorithms learn over common integer quadratic programming and clustering algorithm families: SDP rounding algorithms and agglomerative clustering algorithms with dynamic programming. For our sample complexity analysis, we provide tight bounds on the pseudo-dimension of these algorithm classes and show that, surprisingly, even for classes of algorithms parameterized by a single parameter, the pseudo-dimension is superconstant. In this way, our work both contributes to the foundations of algorithm configuration and pushes the boundaries of learning theory, since the algorithm classes we analyze consist of multi-stage optimization procedures and are significantly more complex than the classes typically studied in learning theory.

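To make the configuration setup concrete, the sketch below picks the parameter of a single-parameter rounding family, an s-linear rule in the spirit of RPR² rounding, by maximizing average cut value over sampled max-cut instances. This is a minimal illustration of the "learn from sampled instances" idea rather than the paper's actual procedure or guarantees: the SDP embedding step is replaced by a random unit-vector embedding purely as a stand-in, the grid of candidate parameters and the instance generator are arbitrary, and all function names (random_embedding, s_linear_round, best_parameter) are assumptions of this example.

```python
# Illustrative sketch only: empirical selection of a single rounding parameter
# over sampled max-cut instances. The embedding is a random stand-in for an
# SDP solution, so the numbers produced are not meaningful approximations.

import numpy as np

def random_embedding(n, d, rng):
    """Stand-in for an SDP solution: one unit vector per vertex."""
    X = rng.normal(size=(n, d))
    return X / np.linalg.norm(X, axis=1, keepdims=True)

def s_linear_round(X, s, rng):
    """Round vectors to {-1, +1} with an s-linear rule on a random direction."""
    g = rng.normal(size=X.shape[1])
    proj = X @ g
    # Probability of assigning +1 grows linearly in the projection with slope
    # 1/(2s), clipped to [0, 1]; s -> 0 recovers sign-based hyperplane rounding.
    p = np.clip(0.5 + proj / (2.0 * max(s, 1e-9)), 0.0, 1.0)
    return np.where(rng.random(X.shape[0]) < p, 1, -1)

def cut_value(W, assignment):
    """Total weight of edges crossing the partition (W symmetric, zero diagonal)."""
    crossing = assignment[:, None] != assignment[None, :]
    return W[crossing].sum() / 2.0

def best_parameter(instances, s_grid, d=20, trials=5, seed=0):
    """Return the parameter with the highest average cut over the sample."""
    rng = np.random.default_rng(seed)
    avg_cut = []
    for s in s_grid:
        total = 0.0
        for W in instances:
            X = random_embedding(W.shape[0], d, rng)
            total += np.mean([cut_value(W, s_linear_round(X, s, rng))
                              for _ in range(trials)])
        avg_cut.append(total / len(instances))
    return s_grid[int(np.argmax(avg_cut))]

if __name__ == "__main__":
    rng = np.random.default_rng(1)

    def sample_instance(n=30):
        # Toy "application-specific distribution": sparse random weighted graphs.
        W = rng.random((n, n)) * (rng.random((n, n)) < 0.3)
        W = np.triu(W, 1)
        return W + W.T

    instances = [sample_instance() for _ in range(10)]
    print("selected s:", best_parameter(instances, s_grid=[0.1, 0.5, 1.0, 2.0]))
```

The sample complexity question studied in the paper is exactly how many sampled instances such an empirical selection needs before the chosen parameter also performs well in expectation over the underlying distribution, which is where the pseudo-dimension bounds on these algorithm classes come in.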