Functional-bandwidth kernel for Support Vector Machine with Functional Data: An alternating optimization algorithm

Abstract: Functional Data Analysis (FDA) is devoted to the study of data that are functions. The Support Vector Machine (SVM) is a benchmark tool for classification, including the classification of functional data. SVM is frequently used with a kernel (e.g., Gaussian) that involves a scalar bandwidth parameter. In this paper, we propose using kernels with functional bandwidths instead. This may improve accuracy, and it also identifies the time intervals that are critical for classification. Tuning the functional parameters of the new kernel is a challenging task, which we express as a continuous optimization problem and solve by means of a heuristic. Our experiments with benchmark data sets show the advantages of using functional parameters and the effectiveness of our approach.
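To make the idea of a functional bandwidth concrete, the sketch below evaluates a Gaussian-type kernel in which the scalar bandwidth is replaced by a nonnegative weight function over time. The abstract does not state the kernel's formula, so the form exp(-∫ ω(t) (x(t) - y(t))² dt), the function name `functional_bandwidth_kernel`, and the toy curves are all assumptions made for illustration, not the paper's definition.

```python
import numpy as np

def functional_bandwidth_kernel(X, Y, omega, t):
    """Gaussian-type kernel with a time-varying bandwidth (assumed form).

    X: (n, T) curves sampled on the grid t; Y: (m, T) curves on the same grid.
    omega: (T,) nonnegative bandwidth values omega(t).
    Returns the (n, m) matrix exp(-integral of omega(t) * (x(t) - y(t))**2 dt),
    with the integral approximated by the trapezoidal rule.
    """
    diff2 = (X[:, None, :] - Y[None, :, :]) ** 2        # shape (n, m, T)
    integrand = omega[None, None, :] * diff2
    dt = np.diff(t)                                      # shape (T - 1,)
    integral = 0.5 * np.sum((integrand[..., 1:] + integrand[..., :-1]) * dt, axis=-1)
    return np.exp(-integral)

# Toy data (hypothetical): two curves that differ only for t > 0.55.
t = np.linspace(0.0, 1.0, 101)
x0 = np.sin(2 * np.pi * t)
x1 = x0 + np.where(t > 0.55, 1.0, 0.0)
X = np.vstack([x0, x1])

# Constant bandwidth: behaves like an ordinary scalar-bandwidth Gaussian kernel.
K_const = functional_bandwidth_kernel(X, X, np.ones_like(t), t)

# Bandwidth concentrated on t < 0.45: the late-time difference is ignored.
omega_early = (t < 0.45).astype(float)
K_early = functional_bandwidth_kernel(X, X, omega_early, t)
```

With the constant bandwidth the two curves are seen as different, while the bandwidth supported on the early interval treats them as identical. This is the sense in which the shape of ω(t) can indicate which time intervals matter for classification. A Gram matrix of this kind could then be passed to any SVM solver that accepts precomputed kernels, for instance scikit-learn's `SVC(kernel="precomputed")`.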
