Independent Component Analysis: An Approach to Clustering
Independent component analysis (ICA) (Hyvärinen et al., 2001) and projection pursuit (PP) (Jones and Sibson, 1987) are closely related techniques that look for “interesting” directions (projections) in the data. ICA assumes a model $x = As$, where $x$ is a vector of observed random variables, $A$ is a “mixing” matrix, and $s$ is a vector of independent latent variables. The task is then to find $A$ in order to recover $s$. A key assumption is usually that the components $s_i$ have different kurtoses, so that the independent components can be separated. In practice, ICA usually measures the “interestingness” of a linear combination in terms of the size of its absolute kurtosis or some related measure. Since the (excess) kurtosis of a Gaussian random variable is zero, this criterion measures, to some extent, non-Gaussianity. In this poster, we are interested in finding a clustering procedure that can be applied for exploratory analysis of large datasets. One specific use of ICA is to pick out clusters from multi-dimensional data via projection. The objective is to find one or more “interesting” directions. It turns out that in the clustering direction the kurtosis is usually negative. Hence it is useful in practice to use a modified version of ICA in which we minimise the kurtosis rather than maximise its absolute value. In this approach, the one-dimensional projection of the multi-dimensional data provides the most interesting view of the full-dimensional data for clustering. However, the kurtosis-based approach is not robust to outliers, and hence a variety of more robust alternatives have been suggested (e.g. negentropy and general contrast functions). Then, by means of “density-based” approaches to clustering (e.g. kernel density estimation clustering (Silverman, 1986) or scale-space clustering (Lindeberg, 1994)), we can plot and explore the one-dimensional projected data to obtain potential clusters.
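The procedure described above (project onto the direction of minimum kurtosis, then inspect the one-dimensional density) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The whitening step, the Nelder-Mead search with random restarts, the synthetic two-cluster data, and the function names `whiten`, `min_kurtosis_direction` and `count_density_modes` are all assumptions made for illustration. The sketch uses plain sample kurtosis, so it shares the sensitivity to outliers noted above.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import gaussian_kde, kurtosis


def whiten(X):
    """Centre X and rescale it to identity covariance (an assumed preprocessing step)."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return Xc @ eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T


def min_kurtosis_direction(Z, n_starts=20, seed=0):
    """Search for a unit vector a that minimises the excess kurtosis of Z @ a."""
    rng = np.random.default_rng(seed)

    def objective(a):
        a = a / np.linalg.norm(a)   # only the direction of a matters
        return kurtosis(Z @ a)      # excess kurtosis: zero for a Gaussian

    best = None
    for _ in range(n_starts):       # random restarts to reduce the risk of local minima
        start = rng.standard_normal(Z.shape[1])
        res = minimize(objective, start, method="Nelder-Mead")
        if best is None or res.fun < best.fun:
            best = res
    return best.x / np.linalg.norm(best.x), best.fun


def count_density_modes(y, grid_size=512):
    """Count interior local maxima of a Gaussian kernel density estimate of y."""
    grid = np.linspace(y.min(), y.max(), grid_size)
    dens = gaussian_kde(y)(grid)
    interior = dens[1:-1]
    return int(np.sum((interior > dens[:-2]) & (interior > dens[2:])))


# Synthetic example: two clusters in three dimensions, separated along the first axis.
rng = np.random.default_rng(1)
n = 500
X = rng.standard_normal((2 * n, 3))
X[:n, 0] -= 3.0
X[n:, 0] += 3.0

Z = whiten(X)
a, k = min_kurtosis_direction(Z)
y = Z @ a                           # one-dimensional projection used for clustering
n_modes = count_density_modes(y)
print("direction:", a)
print("excess kurtosis:", k)
print("density modes:", n_modes)
```

In this toy setting the projected data should have clearly negative kurtosis, and the modes of the kernel density estimate suggest candidate clusters. For real data, the bandwidth and the choice of contrast function would need more care.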
[1] R. Fisher. The Use of Multiple Measurements in Taxonomic Problems, 1936.
[2] Erkki Oja, et al. Independent Component Analysis, 2001.
[3] Robin Sibson, et al. What is projection pursuit, 1987.
[4] Tony Lindeberg, et al. Scale-Space Theory in Computer Vision, 1993, Lecture Notes in Computer Science.
[5] C. D. Kemp, et al. Density Estimation for Statistics and Data Analysis, 1987.