Non-linear dimensionality reduction techniques for unsupervised feature extraction

Dimensionality reduction techniques are regularly used to visualize high-dimensional data sets. In this paper, reduction to d >= 2 dimensions is studied for the purpose of feature extraction. Four non-linear techniques are considered: multidimensional scaling, Sammon's mapping, self-organizing maps and auto-associative feedforward networks. All four techniques are presented within the same optimization framework. They are compared with respect to feature extraction by evaluating the ability of the reduced feature sets to perform classification tasks. The experiments involve an artificial data set and grey-level and color texture data sets. We demonstrate the usefulness of non-linear techniques compared to linear feature extraction.
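
To illustrate what treating such a technique as an optimization problem can look like, the sketch below computes the stress function that Sammon's mapping minimizes, in its commonly cited form: the squared differences between original and low-dimensional pairwise distances, each weighted by the inverse of the original distance and normalized by the sum of original distances. This is an assumption about the standard formulation, not a reproduction of the framework used in the paper. The function name `sammon_stress` and the sample points are hypothetical.

```python
import math
from itertools import combinations


def sammon_stress(high, low):
    """Sammon's stress between points `high` and their low-dimensional images `low`.

    Both arguments are sequences of equal length; high[i] and low[i]
    describe the same sample in the original and the reduced space.
    """
    weighted_error = 0.0
    total_distance = 0.0
    for i, j in combinations(range(len(high)), 2):
        d_high = math.dist(high[i], high[j])  # distance in the original space
        d_low = math.dist(low[i], low[j])     # distance in the reduced space
        if d_high == 0.0:
            continue  # skip coincident points to avoid division by zero
        weighted_error += (d_high - d_low) ** 2 / d_high
        total_distance += d_high
    return weighted_error / total_distance


# Hypothetical example: four 3-D points and a projection that drops the last axis.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
projected = [(x, y) for x, y, _ in points]

print(sammon_stress(points, points))     # 0.0: all distances preserved
print(sammon_stress(points, projected))  # positive: some distances distorted
```

A mapping method would then adjust the low-dimensional coordinates to reduce this value, for instance by gradient descent; that step is left out of the sketch.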