Applications of strong convexity–strong smoothness duality to learning with matrices

A function is strongly convex with respect to a norm if and only if its conjugate function is strongly smooth with respect to the dual norm. This known result is a key component in deriving and analyzing several learning algorithms. Using this duality, we isolate a single inequality that implies both generalization bounds and online regret bounds, and we show how to construct strongly convex functions over matrices from strongly convex functions over vectors. The newly constructed matrix functions inherit the strong convexity properties of the underlying vector functions. We demonstrate the potential of this framework by analyzing several learning algorithms, including group Lasso, kernel learning, and online control with adversarial quadratic costs.
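The duality can be checked numerically on a standard vector example. The sketch below is illustrative and not taken from the abstract: it assumes the well-known pair f(w) = ½‖w‖_q² with q in (1, 2], which is (q − 1)-strongly convex with respect to the ℓ_q norm, and its conjugate ½‖θ‖_p² with 1/p + 1/q = 1, which is then 1/(q − 1)-strongly smooth with respect to the ℓ_p norm. The function names and the sampling scheme are hypothetical choices made for this check.

```python
import random


def lp_norm(x, p):
    """l_p norm of a list of floats."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def half_sq_norm(x, p):
    """The function 0.5 * ||x||_p^2."""
    return 0.5 * lp_norm(x, p) ** 2


def grad_half_sq_norm(x, p):
    """Gradient of 0.5 * ||x||_p^2 for p > 1 (zero at the origin)."""
    n = lp_norm(x, p)
    if n == 0.0:
        return [0.0] * len(x)
    return [(1.0 if t >= 0 else -1.0) * abs(t) ** (p - 1) * n ** (2 - p)
            for t in x]


def min_gaps(q=1.5, dim=4, trials=500, seed=0):
    """Smallest observed slack in the two inequalities over random points.

    Assumed constants (standard facts, not stated in the abstract):
    beta = q - 1 for strong convexity, sigma = 1 / beta for smoothness.
    """
    p = q / (q - 1.0)          # dual exponent, 1/p + 1/q = 1
    beta = q - 1.0
    sigma = 1.0 / beta
    rng = random.Random(seed)
    convex_gap = float("inf")
    smooth_gap = float("inf")
    for _ in range(trials):
        u = [rng.uniform(-1, 1) for _ in range(dim)]
        v = [rng.uniform(-1, 1) for _ in range(dim)]
        a = rng.random()
        mix = [a * s + (1 - a) * t for s, t in zip(u, v)]
        diff = [s - t for s, t in zip(u, v)]
        # Strong convexity of f w.r.t. l_q:
        # f(a*u + (1-a)*v) <= a*f(u) + (1-a)*f(v) - (beta/2)*a*(1-a)*||u-v||_q^2
        cg = (a * half_sq_norm(u, q) + (1 - a) * half_sq_norm(v, q)
              - half_sq_norm(mix, q)
              - 0.5 * beta * a * (1 - a) * lp_norm(diff, q) ** 2)
        # Strong smoothness of the conjugate w.r.t. l_p:
        # f*(u + v) <= f*(u) + <grad f*(u), v> + (sigma/2)*||v||_p^2
        g = grad_half_sq_norm(u, p)
        shifted = [s + t for s, t in zip(u, v)]
        sg = (half_sq_norm(u, p) + sum(gi * vi for gi, vi in zip(g, v))
              + 0.5 * sigma * lp_norm(v, p) ** 2
              - half_sq_norm(shifted, p))
        convex_gap = min(convex_gap, cg)
        smooth_gap = min(smooth_gap, sg)
    return convex_gap, smooth_gap
```

Calling `min_gaps()` returns the smallest slack found in each inequality; both should be nonnegative up to floating-point error. A random search like this can only fail to find a counterexample, so it illustrates the statement rather than proving it.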
