High-dimensional data are often modeled as signal plus noise, where the signal lies on a low-dimensional manifold and is contaminated with high-dimensional noise. The main focus of this paper is estimating the signal subspace when the noise is Gaussian and the signal is non-Gaussian. We allow the Gaussian noise variance to be high, so standard denoising approaches such as Principal Component Analysis fail. The approach also differs from standard Independent Component Analysis in that no independent signal factors are assumed. This model is called non-Gaussian subspace/component analysis (NGCA). Previous approaches to this subspace estimation problem use the fourth-order cumulant matrix or the Hessian of the logarithm of the characteristic function, both of which have practical and theoretical drawbacks. We propose instead to estimate the non-Gaussian subspace using sample density gradient covariances (DGCs), which are similar to the Fisher information matrix, and we use a nonparametric kernel density estimator to estimate the gradients of the density. Moreover, we extend non-Gaussian subspace analysis to a supervised setting in which label or response information is available. For this supervised version we propose conditional density gradient covariances, computed by conditioning on a discretized response variable. We also provide a non-asymptotic analysis of the density gradient covariance, relating the error of estimating the population DGC matrix with the sample DGC to the number of dimensions and the number of samples.
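To make the unsupervised estimator concrete, the sketch below illustrates one plausible reading of the pipeline: whiten the data, estimate density gradients with a Gaussian-kernel KDE, form the sample DGC matrix, and take its leading eigenvectors as a basis for the non-Gaussian subspace. This is a minimal sketch, not the paper's exact procedure; the function names `kde_gradients` and `ngca_dgc`, the fixed bandwidth, and the "top-d eigenvalues" selection rule are illustrative assumptions.

```python
import numpy as np

def kde_gradients(X, bandwidth):
    """Gradient of a Gaussian-kernel KDE, evaluated at each sample.

    X: (n, p) data matrix. Returns an (n, p) array of estimated
    density gradients (one sketch of "density gradient"; the paper
    may instead use log-density gradients).
    """
    n, p = X.shape
    h2 = bandwidth ** 2
    diffs = X[:, None, :] - X[None, :, :]        # (n, n, p) pairwise x_i - x_j
    sq = np.sum(diffs ** 2, axis=-1)             # (n, n) squared distances
    norm = (2.0 * np.pi * h2) ** (p / 2.0)
    K = np.exp(-sq / (2.0 * h2)) / norm          # Gaussian kernel weights
    # grad K_h(x - x_j) = -K_h(x - x_j) * (x - x_j) / h^2, averaged over j
    return -(K[:, :, None] * diffs).sum(axis=1) / (n * h2)

def ngca_dgc(X, d, bandwidth=1.0):
    """Estimate a d-dimensional non-Gaussian subspace via the sample DGC.

    Whitens the data, forms M = (1/n) sum_i g_i g_i^T with g_i the KDE
    gradient at x_i, and returns the top-d eigenvectors of M (a basis in
    the whitened coordinates).
    """
    X = X - X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # Symmetric whitening via the inverse square root of the covariance.
    w, V = np.linalg.eigh(cov)
    W = V @ np.diag(w ** -0.5) @ V.T
    Z = X @ W
    G = kde_gradients(Z, bandwidth)
    M = G.T @ G / len(Z)                         # sample DGC matrix
    evals, evecs = np.linalg.eigh(M)
    # Heuristic: keep the d largest-eigenvalue directions; the paper's
    # selection rule (e.g., deviation from a Gaussian baseline) may differ.
    return evecs[:, np.argsort(evals)[::-1][:d]]
```

In practice the bandwidth choice matters considerably for KDE-based gradient estimates, so any real use of such a sketch would pair it with a standard bandwidth-selection heuristic rather than a fixed default.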