Multi-output learning via spectral filtering

In this paper we study a class of regularized kernel methods for multi-output learning that are based on filtering the spectrum of the kernel matrix. The methods considered include Tikhonov regularization as a special case, as well as interesting alternatives such as vector-valued extensions of L2 boosting and other iterative schemes. Computational properties are discussed for various examples of kernels for vector-valued functions, and the benefits of iterative techniques are illustrated. Generalizing previous results for the scalar case, we derive a finite-sample bound on the excess risk of the obtained estimator, which allows us to prove consistency for both regression and multi-category classification. Finally, we present promising results of the proposed algorithms on artificial and real data.
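To make the spectral-filtering viewpoint concrete, the sketch below applies two standard filter functions to the eigendecomposition of the kernel matrix: the Tikhonov filter g_lambda(s) = 1/(s + lambda) and a Landweber-type (L2-boosting) filter obtained by truncating the Neumann series. This is a minimal illustration under simplifying assumptions, not the paper's implementation: it uses a scalar Gaussian kernel applied independently to each output component rather than an operator-valued kernel, and the function and parameter names (gaussian_kernel, spectral_filter_coefficients, lam, n_iter) are hypothetical choices for this sketch.

```python
import numpy as np

def gaussian_kernel(X1, X2, sigma=1.0):
    # Gaussian kernel matrix between the rows of X1 and X2 (illustrative scalar kernel).
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2.0 * X1 @ X2.T
    return np.exp(-d2 / (2.0 * sigma**2))

def spectral_filter_coefficients(K, Y, method="tikhonov", lam=1e-2, n_iter=100):
    # Returns coefficients C so that the estimator is f(x) = sum_i k(x, x_i) C[i, :].
    # The filter g is applied to the eigenvalues of the normalized kernel matrix K/n.
    n = K.shape[0]
    evals, evecs = np.linalg.eigh(K / n)
    if method == "tikhonov":
        # Tikhonov (ridge) filter: g(s) = 1 / (s + lam)
        g = 1.0 / (evals + lam)
    elif method == "landweber":
        # Landweber / L2-boosting filter: g(s) = tau * sum_{j<n_iter} (1 - tau*s)^j,
        # i.e. the result of n_iter gradient-descent steps on the empirical risk.
        tau = 1.0 / max(evals.max(), 1e-12)
        g = np.array([tau * sum((1.0 - tau * s)**j for j in range(n_iter)) for s in evals])
    else:
        raise ValueError("unknown filter: %s" % method)
    # C = g(K/n) Y / n, computed in the eigenbasis.
    return evecs @ (g[:, None] * (evecs.T @ Y)) / n

# Toy multi-output regression: two output components of a one-dimensional input.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(80, 1))
Y = np.column_stack([np.sin(X[:, 0]), np.cos(X[:, 0])]) + 0.1 * rng.standard_normal((80, 2))
K = gaussian_kernel(X, X, sigma=0.5)
C = spectral_filter_coefficients(K, Y, method="tikhonov", lam=1e-3)
X_test = np.linspace(-3, 3, 200)[:, None]
Y_hat = gaussian_kernel(X_test, X, sigma=0.5) @ C  # predictions for both outputs at once
```

Swapping `method="tikhonov"` for `method="landweber"` replaces the matrix inversion implicit in the Tikhonov filter with an iterative scheme whose iteration count plays the role of the regularization parameter, which is the kind of computational trade-off the iterative techniques in the paper exploit.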
