Learning Sparse Combinatorial Representations via Two-stage Submodular Maximization

We consider the problem of learning sparse representations of data sets, where the goal is to reduce a data set in a manner that optimizes multiple objectives. Motivated by applications in data summarization, we develop a new model, which we refer to as the two-stage submodular maximization problem. This task can be viewed as a combinatorial analogue of representation learning problems such as dictionary learning and sparse regression. The two-stage problem strictly generalizes cardinality-constrained submodular maximization, but its objective function is not submodular, so the techniques for submodular maximization cannot be applied. We describe a continuous optimization method whose approximation ratio asymptotically approaches 1 - 1/e. For instances outside this asymptotic regime, we design a local-search algorithm whose approximation ratio is arbitrarily close to 1/2. We empirically demonstrate the effectiveness of our methods on two multi-objective data summarization tasks, where the goal is to construct summaries via sparse representative subsets with respect to predefined objectives.
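
To make the two-stage objective concrete, the following is a minimal sketch under stated assumptions, not the continuous or local-search method described above. It assumes the common formalization in which, given monotone submodular functions f_1, ..., f_m over a ground set, one selects a summary S of size at most l that maximizes the sum over j of the best value f_j(T) attainable by a subset T of S with at most k elements. The abstract does not spell out this formalization, and the helper names (coverage, inner_value, two_stage_value, best_summary) and the toy coverage functions are hypothetical. Everything is solved by brute force, which is only feasible on tiny instances.

```python
from itertools import combinations


def coverage(sets_by_item):
    """Build a coverage function f(T) = size of the union of the sets of items in T.

    Coverage functions are monotone and submodular; they are used here
    purely as an illustrative choice of objective.
    """
    def f(T):
        covered = set()
        for item in T:
            covered |= sets_by_item[item]
        return len(covered)
    return f


def inner_value(f, S, k):
    """Second stage: brute-force max of f(T) over T within S with |T| <= k."""
    items = list(S)
    best = 0
    for r in range(min(k, len(items)) + 1):
        for T in combinations(items, r):
            best = max(best, f(T))
    return best


def two_stage_value(fs, S, k):
    """Assumed two-stage objective: sum over objectives of the best size-k subset of S."""
    return sum(inner_value(f, S, k) for f in fs)


def best_summary(fs, ground, l, k):
    """First stage by exhaustive search over all size-l summaries (tiny instances only)."""
    return max(combinations(sorted(ground), l),
               key=lambda S: two_stage_value(fs, S, k))


# Toy instance with two objectives over the ground set {a, b, c, d}.
f1 = coverage({"a": {1, 2}, "b": {3}, "c": set(), "d": {1}})
f2 = coverage({"a": set(), "b": {7}, "c": {8, 9}, "d": {9}})
fs = [f1, f2]
ground = {"a", "b", "c", "d"}

summary = best_summary(fs, ground, l=2, k=1)
print(summary, two_stage_value(fs, summary, k=1))
```

With a single objective and k = l, the inner maximization is attained by S itself for monotone functions, so the sketch reduces to ordinary cardinality-constrained maximization, which matches the generalization claim in the abstract.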
