Operator-valued Kernels for Learning from Functional Response Data

In this paper we consider the problems of supervised classification and regression in the case where both attributes and labels are functions: each data instance is represented by a set of functions, and its label is also a function. We focus on the use of reproducing kernel Hilbert space theory to learn from such functional data. Basic concepts and properties of kernel-based learning are extended to include the estimation of function-valued functions. In this setting, we restate the representer theorem, describe a set of rigorously defined infinite-dimensional operator-valued kernels that can be usefully applied when the data are functions, and introduce a learning algorithm for nonlinear functional data analysis. The methodology is illustrated through speech and audio signal processing experiments.
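As a rough illustration of the kind of estimator the abstract refers to, the sketch below fits a regularized least-squares model with a function-valued output. It is not taken from the paper and rests on several assumptions: input and output functions are sampled on fixed grids, the operator-valued kernel has the separable form K(x, x') = k(x, x') T with a scalar Gaussian kernel k and a symmetric positive semidefinite matrix T standing in for an operator on the output space, and the representer-theorem expansion f(x) = sum_i k(x, x_i) T c_i is used. The function names (`gaussian_gram`, `fit_separable_ovk_ridge`, `predict`) and the toy data are hypothetical.

```python
import numpy as np


def gaussian_gram(A, B, gamma):
    """Scalar Gaussian kernel between rows of A (n, p) and B (m, p).

    Each row is assumed to be an input function sampled on a common grid.
    """
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_separable_ovk_ridge(K, T, Y, lam):
    """Solve K C T + lam * C = Y for the coefficient matrix C.

    K: (n, n) scalar Gram matrix; T: (m, m) symmetric PSD output matrix;
    Y: (n, m) sampled output functions; lam > 0 is the ridge parameter.
    Eigendecompositions of K and T decouple the system entrywise.
    """
    lam_k, U = np.linalg.eigh(K)
    mu_t, V = np.linalg.eigh(T)
    Y_tilde = U.T @ Y @ V
    C_tilde = Y_tilde / (np.outer(lam_k, mu_t) + lam)
    return U @ C_tilde @ V.T


def predict(K_test, T, C):
    """Evaluate f(x) = sum_i k(x, x_i) T c_i for each test input (rows of K_test)."""
    return K_test @ C @ T


# Toy data (assumed for illustration): sinusoidal inputs, squared outputs.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 30)
freqs = rng.uniform(1.0, 3.0, size=20)
X = np.sin(2.0 * np.pi * freqs[:, None] * grid[None, :])
Y = X**2

K = gaussian_gram(X, X, gamma=0.1)
# Output matrix T: a smoothing kernel on the output grid (an assumption).
T = np.exp(-50.0 * (grid[:, None] - grid[None, :]) ** 2)

C = fit_separable_ovk_ridge(K, T, Y, lam=1e-2)
Y_hat = predict(K, T, C)
print("mean squared training error:", float(np.mean((Y_hat - Y) ** 2)))
```

The separable form is chosen here only because it keeps the linear system small: instead of inverting an (nm) x (nm) block matrix, two eigendecompositions of sizes n and m suffice. Other operator-valued kernels would generally need a different solver.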
