Nonlinear Dimensionality Reduction for Discriminative Analytics of Multiple Datasets

Principal component analysis (PCA) is widely used for feature extraction and dimensionality reduction, with documented merits in diverse tasks involving high-dimensional data. PCA handles one dataset at a time, however, and is ill-suited to analyzing multiple datasets jointly. In certain data science settings, one is interested in extracting the most discriminative information from one dataset of particular interest (a.k.a. target data) relative to the other(s) (a.k.a. background data). To this end, this paper puts forth a novel approach, termed discriminative (d) PCA, for such discriminative analytics of multiple datasets. Under certain conditions, dPCA is proved to be least-squares optimal in recovering the latent subspace vector unique to the target data relative to the background data. To account for nonlinear data correlations, the (linear) dPCA models for one or multiple background datasets are generalized through kernel-based learning. Interestingly, all dPCA variants admit an analytical solution obtainable with a single (generalized) eigenvalue decomposition. Finally, extensive dimensionality reduction tests using synthetic and real datasets corroborate the merits of the proposed methods.
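The single generalized eigenvalue decomposition mentioned above can be illustrated with a minimal sketch. It assumes that linear dPCA with one background dataset seeks directions u maximizing the ratio (u^T C_x u) / (u^T C_y u), where C_x and C_y are the sample covariance matrices of the target and background data; this objective is not stated in the abstract and is an assumption here. The function name `dpca`, the ridge term `eps`, and the synthetic data are hypothetical choices made only for illustration.

```python
import numpy as np

def dpca(target, background, k=1, eps=1e-6):
    """Sketch of linear dPCA under an assumed ratio objective.

    target:     (m, D) array, rows are target samples
    background: (n, D) array, rows are background samples
    Returns (U, vals): U is (D, k), vals holds the k largest
    generalized eigenvalues of the pencil (C_x, C_y).
    """
    X = target - target.mean(axis=0)
    Y = background - background.mean(axis=0)
    Cx = X.T @ X / X.shape[0]
    # Small ridge (an assumption) keeps C_y positive definite.
    Cy = Y.T @ Y / Y.shape[0] + eps * np.eye(Y.shape[1])

    # Reduce C_x u = lambda C_y u to a standard symmetric problem
    # through the Cholesky factor C_y = L L^T.
    L = np.linalg.cholesky(Cy)
    A = np.linalg.solve(L, Cx)        # L^{-1} C_x
    M = np.linalg.solve(L, A.T)       # L^{-1} C_x L^{-T}
    M = (M + M.T) / 2                 # remove round-off asymmetry
    w, V = np.linalg.eigh(M)

    top = np.argsort(w)[::-1][:k]     # indices of the k largest eigenvalues
    U = np.linalg.solve(L.T, V[:, top])  # map back: u = L^{-T} v
    return U, w[top]

# Synthetic check: both datasets share a dominant direction (axis 0),
# while only the target has extra variance along axis 1.
rng = np.random.default_rng(0)
background = rng.normal(size=(2000, 3)) * np.array([5.0, 1.0, 1.0])
target = rng.normal(size=(2000, 3)) * np.array([5.0, 3.0, 1.0])

U, vals = dpca(target, background, k=1)
u = U[:, 0] / np.linalg.norm(U[:, 0])
print(np.round(np.abs(u), 2), np.round(vals, 2))
```

On this toy data, plain PCA of the target would pick axis 0, since it carries the most variance; the ratio objective instead favors axis 1, where the target varies more than the background.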
