Learning distance to subspace for the nearest subspace methods in high-dimensional data classification

Nearest subspace methods (NSM) are a family of classifiers widely applied to high-dimensional data. In this paper, we propose to improve the classification performance of NSM by learning tailored distance metrics from samples to class subspaces. We term the learned metric the ‘learned distance to subspace’ (LD2S). Using LD2S in the classification rule of NSM brings samples closer to the subspaces of their correct classes and moves them farther from the subspaces of the wrong classes. This makes the classification task easier and improves the classification performance of NSM. We demonstrate the superior classification performance of NSM with LD2S on three real-world high-dimensional spectral datasets.
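
To make the classification rule concrete, the following is a minimal sketch of a nearest subspace classifier in Python with NumPy. It assumes each class subspace is a PCA subspace fitted to that class and that the baseline distance is the orthogonal (residual) distance. The optional `weights` vector is a hypothetical stand-in for a learned metric, included only to show where such a metric would enter the rule; it is not the LD2S formulation, which the abstract does not specify. All function names are illustrative.

```python
import numpy as np


def fit_class_subspaces(X, y, n_components):
    """Fit one PCA subspace (class mean + orthonormal basis) per class."""
    subspaces = {}
    for label in np.unique(y):
        Xc = X[y == label]
        mean = Xc.mean(axis=0)
        # Rows of Vt are the principal directions of the centred class data.
        _, _, Vt = np.linalg.svd(Xc - mean, full_matrices=False)
        subspaces[label] = (mean, Vt[:n_components])
    return subspaces


def distance_to_subspace(x, mean, basis, weights=None):
    """Norm of the residual of x after projection onto a class subspace.

    `weights` (assumption, not from the paper) rescales each residual
    coordinate; with weights=None this is the plain orthogonal distance.
    """
    centred = x - mean
    residual = centred - basis.T @ (basis @ centred)
    if weights is None:
        weights = np.ones_like(residual)
    return float(np.sqrt(np.sum(weights * residual**2)))


def predict(x, subspaces, weights=None):
    """Assign x to the class whose subspace is nearest."""
    return min(
        subspaces,
        key=lambda c: distance_to_subspace(x, *subspaces[c], weights),
    )


# Toy data: class 0 lies along the first axis, class 1 along the second.
X = np.array([[1.0, 0, 0], [2, 0, 0], [3, 0, 0],
              [0, 1, 0], [0, 2, 0], [0, 3, 0]])
y = np.array([0, 0, 0, 1, 1, 1])
subspaces = fit_class_subspaces(X, y, n_components=1)

near_class_0 = predict(np.array([5.0, 0.1, 0.0]), subspaces)
near_class_1 = predict(np.array([0.2, -4.0, 0.0]), subspaces)

# An ambiguous sample: equidistant under the unweighted distance, so the
# choice of weights alone decides which class subspace is nearest.
ambiguous = np.array([1.0, 1.0, 0.0])
with_w_a = predict(ambiguous, subspaces, weights=np.array([4.0, 1.0, 1.0]))
with_w_b = predict(ambiguous, subspaces, weights=np.array([1.0, 4.0, 1.0]))
```

In this sketch the weights are fixed by hand; a metric learning method would instead choose them (or a richer parametrisation) from training data so that samples end up nearer to their own class subspace than to the others.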
