Algorithms for Approximate Minimization of the Difference Between Submodular Functions, with Applications

We extend the work of Narasimhan and Bilmes [30] for minimizing set functions representable as a difference between submodular functions. Similar to [30], our new algorithms are guaranteed to monotonically reduce the objective function at every step. We show, both empirically and theoretically, that the per-iteration cost of our algorithms is much lower than that of [30], and that our algorithms can be used to efficiently minimize a difference between submodular functions under various combinatorial constraints, a problem not previously addressed. We provide computational bounds and a hardness result on the multiplicative inapproximability of minimizing the difference between submodular functions. We show, however, that it is possible to give worst-case additive bounds by providing a polynomial-time computable lower bound on the minima. Finally, we show how a number of machine learning problems can be modeled as minimizing the difference between submodular functions. We experimentally validate our algorithms by testing them on the problem of feature selection with submodular cost features.
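The monotone-descent property mentioned above can be illustrated with a small sketch. It is an assumption-laden illustration, not necessarily the exact procedure of the paper: at each step the submodular function f is replaced by a modular upper bound that is tight at the current set, g is replaced by a modular lower bound that is tight at the current set, and the resulting modular surrogate is minimized exactly. All names (`mod_mod_step`, `minimize_difference`, `f_demo`, `g_demo`) and the toy functions are hypothetical.

```python
import math
from typing import Callable, FrozenSet, Iterable, List, Tuple

SetFn = Callable[[FrozenSet[int]], float]


def mod_mod_step(f: SetFn, g: SetFn, ground: List[int], X: FrozenSet[int]) -> FrozenSet[int]:
    """Minimize a modular surrogate of f - g that is tight at X (f, g submodular)."""
    # Order the ground set so that the elements of X come first; the chain of
    # prefixes then passes through X, which makes the lower bound on g tight at X.
    order = sorted(X) + sorted(set(ground) - X)
    weights = {}
    prefix: FrozenSet[int] = frozenset()
    for j in order:
        nxt = prefix | {j}
        h_j = g(nxt) - g(prefix)  # marginal gain along the chain: modular lower bound of g
        if j in X:
            u_j = f(X) - f(X - {j})  # gain of j within X: upper-bound weight for j in X
        else:
            u_j = f(frozenset({j})) - f(frozenset())  # gain of j at the empty set
        weights[j] = u_j - h_j
        prefix = nxt
    # A modular function is minimized by keeping exactly the negative-weight elements.
    return frozenset(j for j in ground if weights[j] < 0)


def minimize_difference(
    f: SetFn, g: SetFn, ground: List[int], X0: Iterable[int] = (), max_iter: int = 100
) -> Tuple[FrozenSet[int], List[float]]:
    """Iterate surrogate minimization; stop when f - g no longer strictly decreases."""
    X = frozenset(X0)
    history = [f(X) - g(X)]
    for _ in range(max_iter):
        Y = mod_mod_step(f, g, ground, X)
        val = f(Y) - g(Y)
        if val >= history[-1] - 1e-12:
            break  # local stopping point: the surrogate gives no further improvement
        X = Y
        history.append(val)
    return X, history


# Toy instance (hypothetical): f is a concave function of cardinality, g is set coverage.
covers = {0: {0, 1}, 1: {1, 2}, 2: {2, 3}, 3: {3}, 4: {0, 4}}
ground = sorted(covers)


def f_demo(S: FrozenSet[int]) -> float:
    return 1.5 * math.sqrt(len(S))


def g_demo(S: FrozenSet[int]) -> float:
    return float(len(set().union(*(covers[j] for j in S))))


X_final, history = minimize_difference(f_demo, g_demo, ground)
print(sorted(X_final), history)
```

Because the surrogate upper-bounds f - g everywhere and agrees with it at the current set, each accepted step cannot increase the true objective. The stopping point is only a local solution; in line with the hardness result stated above, no multiplicative approximation guarantee should be expected from such a procedure.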

[1]  Alexander Schrijver,et al.  A Combinatorial Algorithm Minimizing Submodular Functions in Strongly Polynomial Time , 2000, J. Comb. Theory B.

[2]  Andreas Krause,et al.  Near-Optimal Sensor Placements in Gaussian Processes: Theory, Efficient Algorithms and Empirical Studies , 2008 .

[3]  Endre Boros,et al.  Pseudo-Boolean optimization , 2002, Discret. Appl. Math..

[4]  Andreas Krause,et al.  Near-Optimal Sensor Placements in Gaussian Processes: Theory, Efficient Algorithms and Empirical Studies , 2008, J. Mach. Learn. Res..

[5]  Jeff A. Bilmes,et al.  Submodularity beyond submodular energies: Coupling edges in graph cuts , 2011, CVPR 2011.

[6]  Jeff A. Bilmes,et al.  Online algorithms for submodular minimization with combinatorial constraints , 2010, NIPS 2010.

[7]  Kevin M. Byrnes,et al.  Maximizing General Set Functions by Submodular Decomposition , 2009, 0906.0120.

[8]  Hui Lin,et al.  Multi-document Summarization via Budgeted Maximization of Submodular Functions , 2010, NAACL.

[9]  James B. Orlin,et al.  A faster strongly polynomial time algorithm for submodular function minimization , 2007, Math. Program..

[10]  William Bialek,et al.  Synergy in a Neural Code , 2000, Neural Computation.

[11]  S. Fujishige,et al.  A Submodular Function Minimization Algorithm Based on the Minimum-Norm Base , 2009 .

[12]  Lisa Fleischer,et al.  Submodular Approximation: Sampling-based Algorithms and Lower Bounds , 2008, 2008 49th Annual IEEE Symposium on Foundations of Computer Science.

[13]  Jeff A. Bilmes,et al.  Approximation Bounds for Inference using Cooperative Cuts , 2011, ICML.

[14]  Jeff A. Bilmes,et al.  A Submodular-supermodular Procedure with Applications to Discriminative Structure Learning , 2005, UAI.

[15]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[16]  H. B. McMahan,et al.  Robust Submodular Observation Selection , 2008 .

[17]  M. L. Fisher,et al.  An analysis of approximations for maximizing submodular set functions—I , 1978, Math. Program..

[18]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[19]  Andrew C. Gallagher,et al.  Inference for order reduction in Markov random fields , 2011, CVPR 2011.

[20]  Hui Lin,et al.  A Class of Submodular Functions for Document Summarization , 2011, ACL.

[21]  Joseph Naor,et al.  A Tight Linear Time (1/2)-Approximation for Unconstrained Submodular Maximization , 2012, 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science.

[22]  Andreas Krause,et al.  Near-optimal Nonmyopic Value of Information in Graphical Models , 2005, UAI.

[23]  Ron Kohavi,et al.  Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid , 1996, KDD.

[24]  William H. Cunningham,et al.  Decomposition of submodular functions , 1983, Comb..

[25]  Luca Trevisan,et al.  Inapproximability of Combinatorial Optimization Problems , 2004, Electron. Colloquium Comput. Complex..

[26]  Vahab S. Mirrokni,et al.  Non-monotone submodular maximization under matroid and knapsack constraints , 2009, STOC '09.

[27]  U. Feige,et al.  Maximizing Non-monotone Submodular Functions , 2011 .

[28]  N. Alon,et al.  The Probabilistic Method, Second Edition , 2000 .

[29]  Satoru Fujishige,et al.  Submodular functions and optimization , 1991 .

[30]  Noga Alon,et al.  The Probabilistic Method , 2015, Fundamentals of Ramsey Theory.

[31]  Alan L. Yuille,et al.  The Concave-Convex Procedure (CCCP) , 2001, NIPS.

[32]  Vladimir Kolmogorov,et al.  Minimizing Nonsubmodular Functions with Graph Cuts-A Review , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33]  Jack Edmonds,et al.  Submodular Functions, Matroids, and Certain Polyhedra , 2001, Combinatorial Optimization.

[34]  Mihalis Yannakakis,et al.  Simple Local Search Problems That are Hard to Solve , 1991, SIAM J. Comput..

[35]  Jeff A. Bilmes,et al.  Online Submodular Minimization for Combinatorial Structures , 2011, ICML.

[36]  Michael I. Jordan,et al.  Graphical Models, Exponential Families, and Variational Inference , 2008, Found. Trends Mach. Learn..

[37]  J. Bilmes,et al.  Cooperative Cuts: Graph Cuts with Submodular Edge Weights , 2010 .

[38]  Hui Lin,et al.  Optimal Selection of Limited Vocabulary Speech Corpora , 2011, INTERSPEECH.

[39]  Pat Langley,et al.  Trading Off Simplicity and Coverage in Incremental concept Learning , 1988, ML.

[40]  Yoshinobu Kawahara,et al.  Prismatic Algorithm for Discrete D.C. Programming Problem , 2011, NIPS.