The adaptive complexity of maximizing a submodular function

In this paper we study the adaptive complexity of submodular optimization. Informally, the adaptive complexity of a problem is the minimal number of sequential rounds required to achieve a constant factor approximation when polynomially many queries can be executed in parallel in each round. Adaptivity is a fundamental concept that is heavily studied in computer science, largely due to the need to parallelize computation. Somewhat surprisingly, very little is known about adaptivity in submodular optimization. For the canonical problem of maximizing a monotone submodular function under a cardinality constraint, to the best of our knowledge, all that is known to date is that the adaptive complexity lies between 1 and Ω(n). Our main result in this paper is a tight characterization showing that the adaptive complexity of maximizing a monotone submodular function under a cardinality constraint is Θ(log n):

- We describe an algorithm that requires O(log n) sequential rounds and achieves an approximation arbitrarily close to 1/3;
- We show that no algorithm can achieve an approximation better than O(1 / log n) with fewer than O(log n / log log n) rounds.

Thus, when allowing for parallelization, our algorithm achieves a constant factor approximation exponentially faster than any previously known algorithm for submodular maximization. Importantly, the approximation algorithm is obtained via adaptive sampling and complements a recent line of work on optimizing functions learned from data. In many cases we do not know the functions we optimize and must learn them from labeled samples. Recent results show that no algorithm can obtain a constant factor approximation guarantee using polynomially many labeled samples drawn from any distribution, as in the PAC and PMAC models.
Since learning with non-adaptive samples over any distribution results in a sharp impossibility, we consider learning with adaptive samples where the learner obtains poly(n) samples drawn from a distribution of her choice in every round. Our result implies that in the realizable case, where there is a true underlying function generating the data, Θ(log n) batches of adaptive samples are necessary and sufficient to approximately “learn to optimize” a monotone submodular function under a cardinality constraint.
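The adaptive-sampling idea can be illustrated with a toy sketch (this is an illustration under stated assumptions, not the paper's algorithm): in each of roughly log n rounds, many random blocks of surviving elements are evaluated "in parallel" against the current solution, the best block is added, and elements with no remaining marginal value are discarded before the next round. The helper names `adaptive_sampling_max` and `coverage`, the sample count, and the filtering threshold are all illustrative choices; the sketch is shown on a coverage function, a canonical monotone submodular function.

```python
import math
import random

def coverage(sets, S):
    """Monotone submodular objective: number of items covered by the union
    of the chosen sets (S is a list of keys into `sets`)."""
    covered = set()
    for i in S:
        covered |= sets[i]
    return len(covered)

def adaptive_sampling_max(f, ground, k, rounds=None, samples=30, seed=0):
    """Toy adaptive-sampling sketch for max f(S) s.t. |S| <= k.
    Each round issues a batch of queries (one adaptive round), adds the
    best sampled block, and filters out elements with zero marginal value."""
    rng = random.Random(seed)
    n = len(ground)
    # Roughly log(n) rounds, as in the adaptive-complexity regime.
    r = rounds if rounds is not None else max(1, round(math.log2(n)))
    S = []                       # current solution
    X = list(ground)             # surviving candidate elements
    block = max(1, k // r)       # elements added per round
    for _ in range(r):
        if len(S) >= k or not X:
            break
        # One round of "parallel" queries: evaluate many random blocks.
        cands = [rng.sample(X, min(block, len(X))) for _ in range(samples)]
        best = max(cands, key=lambda B: f(S + B))
        S += best
        X = [x for x in X if x not in S]
        # Filter: discard elements whose marginal contribution is zero.
        base = f(S)
        X = [x for x in X if f(S + [x]) - base > 0]
    return S[:k]  # sketch simplification: truncate if a block overshot k
```

For instance, on a small coverage instance with four sets and k = 2, the sketch recovers a high-value pair in two rounds, mirroring how each adaptive round narrows the candidate pool before committing further elements.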

[1]  Rocco A. Servedio,et al.  Settling the Query Complexity of Non-Adaptive Junta Testing , 2017, Computational Complexity Conference.

[2]  Hui Lin,et al.  On fast approximate submodular minimization , 2011, NIPS.

[3]  Éva Tardos,et al.  Maximizing the Spread of Influence through a Social Network , 2015, Theory Comput..

[4]  Sanjay Ghemawat,et al.  MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.

[5]  Andreas Krause,et al.  Distributed Submodular Maximization: Identifying Representative Elements in Massive Data , 2013, NIPS.

[6]  Azarakhsh Malekian,et al.  Maximizing Sequence-Submodular Functions and its Application to Online Advertising , 2010, Manag. Sci..

[7]  Maria-Florina Balcan Learning Submodular Functions with Applications to Multi-Agent Systems , 2015, AAMAS.

[8]  Sergei Vassilvitskii,et al.  Fast greedy algorithms in mapreduce and streaming , 2013, SPAA.

[9]  Noam Nisan,et al.  Rounds in communication complexity revisited , 1991, STOC '91.

[10]  Richard Cole,et al.  Parallel merge sort , 1988, 27th Annual Symposium on Foundations of Computer Science (sfcs 1986).

[11]  Jan Vondrák,et al.  Tight Bounds on Low-Degree Spectral Concentration of Submodular and XOS Functions , 2015, 2015 IEEE 56th Annual Symposium on Foundations of Computer Science.

[12]  Rishabh K. Iyer,et al.  Fast Multi-stage Submodular Maximization , 2014, ICML.

[13]  S. Matthew Weinberg,et al.  Parallel algorithms for select and partition with noisy comparisons , 2016, STOC.

[14]  Eric Balkanski,et al.  The limitations of optimization from samples , 2015, STOC.

[15]  Morteza Zadimoghaddam,et al.  Bicriteria Distributed Submodular Maximization in a Few Rounds , 2017, SPAA.

[16]  Bonnie Berger,et al.  Efficient NC Algorithms for Set Cover with Applications to Learning and Geometry , 1994, J. Comput. Syst. Sci..

[17]  Leslie G. Valiant,et al.  A theory of the learnable , 1984, STOC '84.

[18]  Robert D. Nowak,et al.  Compressive distilled sensing: Sparse recovery using adaptivity in compressive measurements , 2009, 2009 Conference Record of the Forty-Third Asilomar Conference on Signals, Systems and Computers.

[19]  Andreas Krause,et al.  Noisy Submodular Maximization via Adaptive Sampling with Applications to Crowdsourced Image Collection Summarization , 2015, AAAI.

[20]  Jarvis Haupt,et al.  Adaptive Sensing for Sparse Signal Recovery , 2009, 2009 IEEE 13th Digital Signal Processing Workshop and 5th IEEE Signal Processing Education Workshop.

[21]  Morteza Zadimoghaddam,et al.  Randomized Composable Core-sets for Distributed Submodular Maximization , 2015, STOC.

[22]  Guy E. Blelloch,et al.  Fast set operations using treaps , 1998, SPAA '98.

[23]  Vincent A Voelz,et al.  Taming the complexity of protein folding. , 2011, Current opinion in structural biology.

[24]  Laks V. S. Lakshmanan,et al.  CELF++: optimizing the greedy algorithm for influence maximization in social networks , 2011, WWW.

[25]  Tim Roughgarden,et al.  Sketching valuation functions , 2012, SODA.

[26]  Joseph K. Bradley,et al.  Parallel Double Greedy Submodular Maximization , 2014, NIPS.

[27]  Suvrit Sra,et al.  Reflection methods for user-friendly submodular optimization , 2013, NIPS.

[28]  Arpit Agarwal,et al.  Learning with Limited Rounds of Adaptivity: Coin Tossing, Multi-Armed Bandits, and Ranking from Pairwise Comparisons , 2017, COLT.

[29]  Jan Vondrák,et al.  Optimal Bounds on Approximation of Submodular and XOS Functions by Juntas , 2013, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.

[30]  Zvi Galil,et al.  Lower bounds on communication complexity , 1984, STOC '84.

[31]  U. Feige,et al.  Maximizing Non-monotone Submodular Functions , 2011 .

[32]  Noam Nisan,et al.  Economic efficiency requires interaction , 2013, STOC.

[33]  Ravi Kumar,et al.  Max-cover in map-reduce , 2010, WWW '10.

[34]  Rocco A. Servedio,et al.  Adaptivity Helps for Testing Juntas , 2015, CCC.

[35]  David C. Parkes,et al.  Learnability of Influence in Networks , 2015, NIPS.

[36]  Vijay V. Vazirani,et al.  Primal-Dual RNC Approximation Algorithms for Set Cover and Covering Integer Programs , 1999, SIAM J. Comput..

[37]  Rishabh K. Iyer,et al.  Learning Mixtures of Submodular Functions for Image Collection Summarization , 2014, NIPS.

[38]  Huy L. Nguyen,et al.  The Power of Randomization: Distributed Submodular Maximization on Massive Datasets , 2015, ICML.

[39]  Wei Chen,et al.  Efficient influence maximization in social networks , 2009, KDD.

[40]  Akram Aldroubi,et al.  Sequential adaptive compressed sampling via Huffman codes , 2008, ArXiv.

[41]  Christos H. Papadimitriou,et al.  Locally Adaptive Optimization: Adaptive Seeding for Monotone Submodular Functions , 2016, SODA.

[42]  Andreas Krause,et al.  Adaptive Submodularity: A New Approach to Active Learning and Stochastic Optimization , 2010, COLT 2010.

[43]  Clément L. Canonne,et al.  An Adaptivity Hierarchy Theorem for Property Testing , 2017, Electron. Colloquium Comput. Complex..

[44]  Matthew Richardson,et al.  Mining the network value of customers , 2001, KDD '01.

[45]  Yaron Singer,et al.  Scalable Methods for Adaptively Seeding a Social Network , 2015, WWW.

[46]  Michael I. Jordan,et al.  On the Convergence Rate of Decomposable Submodular Function Minimization , 2014, NIPS.

[47]  Sofya Raskhodnikova,et al.  A Note on Adaptivity in Testing Properties of Bounded Degree Graphs , 2006, Electron. Colloquium Comput. Complex..

[48]  Dmitry M. Malioutov,et al.  Compressed sensing with sequential observations , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.

[49]  M. L. Fisher,et al.  An analysis of approximations for maximizing submodular set functions—I , 1978, Math. Program..

[50]  Wei Chen,et al.  Scalable influence maximization for prevalent viral marketing in large-scale social networks , 2010, KDD.

[51]  Leslie G. Valiant,et al.  Parallelism in Comparison Problems , 1975, SIAM J. Comput..

[52]  Guy E. Blelloch,et al.  Linear-work greedy parallel approximate set cover and variants , 2011, SPAA '11.

[53]  Peter K. Jimack,et al.  On the use of adjoint-based sensitivity estimates to control local mesh refinement , 2009 .

[54]  Maria-Florina Balcan,et al.  Learning submodular functions , 2010, ECML/PKDD.

[55]  Noga Alon,et al.  Welfare Maximization with Limited Interaction , 2015, 2015 IEEE 56th Annual Symposium on Foundations of Computer Science.

[56]  Matthew Richardson,et al.  Mining knowledge-sharing sites for viral marketing , 2002, KDD.

[57]  Guy E. Blelloch,et al.  Programming parallel algorithms , 1996, CACM.

[58]  Lior Seeman,et al.  Adaptive Seeding in Social Networks , 2013, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.

[59]  G. Seber,et al.  Adaptive Cluster Sampling , 2012 .

[60]  Pravesh Kothari,et al.  Learning Coverage Functions and Private Release of Marginals , 2014, COLT.

[61]  Guy E. Blelloch,et al.  Parallel and I/O efficient set covering algorithms , 2012, SPAA '12.

[62]  Nikhil R. Devanur,et al.  Whole-page optimization and submodular welfare maximization with online bidders , 2013, EC '13.

[63]  Eyal Kushilevitz,et al.  Communication Complexity , 1997, Adv. Comput..

[64]  David P. Woodruff,et al.  On the Power of Adaptivity in Sparse Recovery , 2011, 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science.

[65]  Maria-Florina Balcan,et al.  Learning Valuation Functions , 2011, COLT.

[66]  Ronald de Wolf,et al.  The non-adaptive query complexity of testing k-parities , 2013, Chic. J. Theor. Comput. Sci..

[67]  Peter Bro Miltersen,et al.  On data structures and asymmetric communication complexity , 1994, STOC '95.

[68]  A. Razborov Communication Complexity , 2011 .

[69]  Harvard University,et al.  Minimizing a Submodular Function from Samples , 2017 .

[70]  Andreas Krause,et al.  Distributed Submodular Cover: Succinctly Summarizing Massive Data , 2015, NIPS.

[71]  Sergei Vassilvitskii,et al.  A model of computation for MapReduce , 2010, SODA '10.

[72]  Lawrence Carin,et al.  Bayesian Compressive Sensing , 2008, IEEE Transactions on Signal Processing.

[73]  Nikhil R. Devanur,et al.  Whole-Page Optimization and Submodular Welfare Maximization with Online Bidders , 2016 .

[74]  Eric Balkanski,et al.  The Sample Complexity of Optimizing a Convex Function , 2017, COLT.

[75]  Huy L. Nguyen,et al.  A New Framework for Distributed Submodular Maximization , 2015, 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS).