Combining Linear Dimension Reduction Subspaces

Dimensionality is a major concern in the analysis of large data sets. Many well-known dimension reduction methods exist, each with its own strengths and weaknesses, and in practice it is difficult to decide which one to use because different methods emphasize different structures in the data. In the spirit of ensemble methods in statistical learning, several dimension reduction methods can be combined using an extension of the Crone and Crosby distance, a weighted distance between subspaces that allows subspaces of different dimensions to be compared and combined. Some natural choices of weights are considered in detail. Based on the weighted distance, we discuss the concept of averages of subspaces and how to combine the results of various dimension reduction methods. The performance of the weighted distances and of the combining approach is illustrated via simulations and a real data example.
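As a point of reference, the classical Crone and Crosby distance between two subspaces of equal dimension can be computed from their orthogonal projection matrices as D(A, B) = ||P_A - P_B||_F / sqrt(2). The sketch below illustrates this basic (unweighted, equal-dimension) distance; the function names are illustrative, and the weighted extension to subspaces of different dimensions discussed in the paper is not implemented here.

```python
import numpy as np

def projection_matrix(basis):
    # Orthogonal projection matrix onto the column space of `basis`
    # (columns need not be orthonormal; QR handles that).
    q, _ = np.linalg.qr(basis)
    return q @ q.T

def crone_crosby(a, b):
    # Crone-Crosby distance between the subspaces spanned by the
    # columns of `a` and `b`, assumed here to have equal dimension:
    # D(A, B) = ||P_A - P_B||_F / sqrt(2).
    pa = projection_matrix(a)
    pb = projection_matrix(b)
    return np.linalg.norm(pa - pb, "fro") / np.sqrt(2)

# Two lines in R^2: identical subspaces have distance 0,
# orthogonal one-dimensional subspaces have distance 1.
e1 = np.array([[1.0], [0.0]])
e2 = np.array([[0.0], [1.0]])
print(crone_crosby(e1, e1))  # 0.0
print(crone_crosby(e1, e2))  # 1.0
```

With this normalization the distance between two k-dimensional subspaces lies in [0, sqrt(k)], and it is invariant to the choice of basis, which is what makes it suitable for comparing the output of different dimension reduction methods.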
