On the orthogonal distance to class subspaces for high-dimensional data classification

The orthogonal distance from an instance to a class subspace is a key metric in pattern classification by class-subspace-based methods. This distance is closely related to the residual standard deviation of a test instance from the class subspace. In this paper, we show that an established and widely used relationship between the residual standard deviation and the sum of squares of the residual principal component (PC) scores is not precise, and can therefore lead to incorrect inference for high-dimensional data, which are now common in practice.
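
As a rough numerical illustration of why the two quantities can differ, the sketch below compares the squared orthogonal distance of an instance to a PCA class subspace with the sum of squares of its scores on the remaining PCs of the training data. It is not taken from the paper: the synthetic data, the sizes `n`, `p`, `k`, and the helper names `orthogonal_distance_sq` and `residual_score_sum_sq` are all assumptions made for this example. It relies only on the standard fact that centered training data with fewer instances than variables (`n < p`) span at most `n - 1` directions, so a new instance can have a residual component that no training PC captures.

```python
import numpy as np


def orthogonal_distance_sq(x, mean, primary):
    """Squared orthogonal distance from x to the subspace mean + span(primary)."""
    centered = x - mean
    residual = centered - primary @ (primary.T @ centered)
    return float(residual @ residual)


def residual_score_sum_sq(x, mean, residual_pcs):
    """Sum of squared scores of x on the non-primary (residual) PCs."""
    scores = residual_pcs.T @ (x - mean)
    return float(scores @ scores)


# Hypothetical high-dimensional setting: fewer instances than variables.
rng = np.random.default_rng(0)
n, p, k = 20, 100, 3              # instances, variables, primary PCs (assumed)
train = rng.normal(size=(n, p))   # synthetic training data for one class
test = rng.normal(size=p)         # synthetic test instance

mean = train.mean(axis=0)
# Rows of vt are the PC loading vectors of the centered training data.
_, s, vt = np.linalg.svd(train - mean, full_matrices=False)
rank = int(np.sum(s > 1e-10 * s[0]))  # at most n - 1 after centering

primary = vt[:k].T            # p x k basis of the class subspace
residual_pcs = vt[k:rank].T   # p x (rank - k) remaining PCs with nonzero variance

od_sq = orthogonal_distance_sq(test, mean, primary)
score_sq = residual_score_sum_sq(test, mean, residual_pcs)
train_od_sq = orthogonal_distance_sq(train[0], mean, primary)
train_score_sq = residual_score_sum_sq(train[0], mean, residual_pcs)

print(f"rank of centered training data: {rank}")
print(f"test:  orthogonal distance^2 = {od_sq:.3f}, residual-score sum = {score_sq:.3f}")
print(f"train: orthogonal distance^2 = {train_od_sq:.3f}, residual-score sum = {train_score_sq:.3f}")
```

In this sketch the two quantities agree for a training instance, which lies in the span of the training PCs, but the residual-score sum is smaller than the squared orthogonal distance for the test instance. Whether this is exactly the imprecision analysed in the paper is an assumption here.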
