Classification of functional fragments by regularized linear classifiers with domain selection

We consider the problem of classifying functional data into two groups using linear classifiers based on one-dimensional projections of functions. We reformulate the search for the best classifier as an optimization problem and solve it by regularization techniques, namely the conjugate gradient method with early stopping, the principal component method, and the ridge method. We study the empirical version with finite training samples consisting of incomplete functions observed on different subsets of the domain, and show that the optimal, possibly zero, misclassification probability can be achieved in the limit along a possibly non-convergent empirical regularization path. Because the approach can work with fragmentary training data, we propose a domain extension and selection procedure that finds the best domain beyond the common observation domain of all curves. In a simulation study, we compare the different regularization methods and investigate the performance of domain selection. We illustrate the methodology on a medical data set, where domain extension yields a substantial improvement in classification accuracy.
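As a rough illustration of the ridge variant mentioned above, the following minimal sketch classifies discretized curves by the sign of a one-dimensional projection. It is not the authors' implementation: it assumes fully observed curves on a common grid (no fragments and no domain selection), ignores quadrature weights in the inner product, and uses simulated data. The names `ridge_direction`, `classify`, `simulate` and the value of `alpha` are hypothetical choices made for this example only.

```python
import numpy as np

def ridge_direction(X0, X1, alpha):
    """Ridge-regularized projection direction on a grid.

    Solves (C + alpha * I) psi = mu1 - mu0, where C is the pooled
    sample covariance matrix of the discretized curves.
    """
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    R = np.vstack([X0 - mu0, X1 - mu1])      # centred curves from both groups
    C = R.T @ R / R.shape[0]                 # pooled covariance on the grid
    psi = np.linalg.solve(C + alpha * np.eye(C.shape[0]), mu1 - mu0)
    return psi, mu0, mu1

def classify(X, psi, mu0, mu1):
    """Assign group 1 when the centred projection onto psi is positive."""
    return ((X - (mu0 + mu1) / 2) @ psi > 0).astype(int)

def simulate(n, shift, t, rng):
    """Hypothetical toy model: random sine expansion plus a linear mean shift."""
    k = np.arange(1, 6)
    basis = np.sqrt(2) * np.sin(np.outer(k, np.pi * t))   # shape (5, len(t))
    scores = rng.normal(size=(n, 5)) / k                   # decaying score variances
    return scores @ basis + shift * t

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)

# Training samples for the two groups (assumed complete curves).
X0_train = simulate(100, 0.0, t, rng)
X1_train = simulate(100, 2.0, t, rng)
psi, mu0, mu1 = ridge_direction(X0_train, X1_train, alpha=0.1)

# Independent test sample with known labels.
X_test = np.vstack([simulate(200, 0.0, t, rng), simulate(200, 2.0, t, rng)])
y_test = np.repeat([0, 1], 200)
accuracy = (classify(X_test, psi, mu0, mu1) == y_test).mean()
print(f"test accuracy: {accuracy:.3f}")
```

In this toy model the mean difference has a component outside the span of the random variation, so a small ridge parameter gives very accurate classification. Replacing the linear solve with a truncated eigendecomposition, or with a few conjugate gradient iterations, would give sketches of the other two regularization methods.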
