Greedy Algorithms for Structurally Constrained High Dimensional Problems

A hallmark of modern machine learning is its ability to handle high-dimensional problems by exploiting structural assumptions that limit the degrees of freedom of the underlying model. A deep understanding of the capabilities and limits of high-dimensional learning methods has been attained under specific assumptions such as sparsity, group sparsity, and low rank. Efforts [9, 18] are now underway to distill this valuable experience into general unified frameworks that achieve the twin goals of summarizing previous analyses and enabling their application to notions of structure hitherto unexplored. Inspired by these developments, we propose and analyze a general computational scheme, based on a greedy strategy, for solving the convex optimization problems that arise in structurally constrained high-dimensional learning. Our framework not only unifies existing greedy algorithms by recovering them as special cases but also yields novel ones. Finally, we extend our results to infinite-dimensional settings by exploiting connections between the smoothness of norms and the behavior of martingales in Banach spaces.
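As a concrete illustration of the kind of greedy scheme analyzed here, the sketch below runs a Frank-Wolfe-style greedy iteration [1, 19] over a finite atomic set: at each step a linear minimization oracle picks the atom most aligned with the negative gradient, and the iterate takes a convex-combination step toward it. This is a minimal sketch under stated assumptions, not the paper's exact algorithm: the atomic set (a scaled l1-ball), the least-squares objective, the 2/(t+2) step size, and all names in the code are illustrative choices.

```python
# A minimal Frank-Wolfe-style greedy sketch over an atomic set.
# Illustrative assumptions: atoms = signed scaled coordinate vectors
# (an l1-ball), objective f(x) = ||Ax - b||^2, step size 2/(t+2).
import numpy as np

def greedy_atomic_min(grad_f, atoms, x0, n_iters=100):
    """Greedily minimize a smooth convex f over the convex hull of `atoms`.

    Each step calls a linear minimization oracle, argmin_{a in atoms} <grad, a>,
    and moves toward the selected atom by a convex-combination step.
    """
    x = x0.copy()
    for t in range(n_iters):
        g = grad_f(x)
        scores = atoms @ g                # inner product of each atom with the gradient
        a = atoms[np.argmin(scores)]      # atom most aligned with the negative gradient
        gamma = 2.0 / (t + 2.0)           # standard step size; yields an O(1/t) rate
        x = (1 - gamma) * x + gamma * a
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, tau = 50, 20, 5.0               # tau scales the l1-ball constraint
    A, b = rng.normal(size=(n, d)), rng.normal(size=n)
    grad = lambda x: 2 * A.T @ (A @ x - b)  # gradient of ||Ax - b||^2
    # Atoms of the l1-ball of radius tau: +/- tau * e_i.
    atoms = tau * np.vstack([np.eye(d), -np.eye(d)])
    x_hat = greedy_atomic_min(grad, atoms, x0=np.zeros(d), n_iters=500)
    print("objective:", np.sum((A @ x_hat - b) ** 2))
```

With the atoms taken instead as signed group indicators or rank-one matrices, the same oracle-plus-step template yields group-sparse and low-rank variants, which is the sense in which a single greedy scheme can unify these settings.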

[1] M. Frank and P. Wolfe. An algorithm for quadratic programming. Naval Research Logistics Quarterly, 1956.

[2] G. Pisier. Martingales with values in uniformly convex spaces. Israel Journal of Mathematics, 1975.

[3] M. J. Donahue, L. Gurvits, C. Darken, and E. Sontag. Rates of Convex Approximation in Non-Hilbert Spaces. Constructive Approximation, 1997.

[4] T. Zhang. Sequential greedy approximation for certain convex optimization problems. IEEE Transactions on Information Theory, 2003.

[5] R. E. Schapire. The Boosting Approach to Machine Learning: An Overview. In Nonlinear Estimation and Classification, 2003.

[6] A. Pietsch. History of Banach Spaces and Linear Operators. Birkhäuser, 2007.

[7] J. Borwein et al. Uniformly convex functions on Banach spaces, 2008.

[8] K. L. Clarkson. Coresets, sparse greedy approximation, and the Frank-Wolfe algorithm. In SODA, 2008.

[9] S. Negahban, P. Ravikumar, M. J. Wainwright, and B. Yu. A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers. In NIPS, 2009.

[10] H. Liu and X. Chen. Nonparametric Greedy Algorithms for the Sparse Learning Problem. In NIPS, 2009.

[11] G. Świrszcz, N. Abe, and A. C. Lozano. Grouped Orthogonal Matching Pursuit for Variable Selection and Prediction. In NIPS, 2009.

[12] S. Shalev-Shwartz, N. Srebro, and T. Zhang. Trading Accuracy for Sparsity in Optimization Problems with Sparsity Constraints. SIAM Journal on Optimization, 2010.

[13] A. Agarwal, S. Negahban, and M. J. Wainwright. Fast global convergence rates of gradient methods for high-dimensional statistical recovery. In NIPS, 2010.

[14] A. C. Lozano, G. Świrszcz, and N. Abe. Group Orthogonal Matching Pursuit for Logistic Regression. In AISTATS, 2011.

[15] V. Temlyakov. Greedy Approximation. Cambridge Monographs on Applied and Computational Mathematics, Cambridge University Press, 2011.

[16] S. Shalev-Shwartz, A. Gonen, and O. Shamir. Large-Scale Convex Minimization with a Low-Rank Constraint. In ICML, 2011.

[17] A. Das and D. Kempe. Submodular meets Spectral: Greedy Algorithms for Subset Selection, Sparse Approximation and Dictionary Selection. In ICML, 2011.

[18] V. Chandrasekaran, B. Recht, P. A. Parrilo, and A. S. Willsky. The Convex Geometry of Linear Inverse Problems. Foundations of Computational Mathematics, 2012.

[19] M. Jaggi. Revisiting Frank-Wolfe: Projection-Free Sparse Convex Optimization. In ICML, 2013.

[20] C. J. Hillar and L.-H. Lim. Most Tensor Problems Are NP-Hard. Journal of the ACM, 2013.