Learning nonlinear manifolds based on mixtures of localized linear manifolds under a self-organizing framework

This paper presents a neural model that learns low-dimensional nonlinear manifolds embedded in a higher-dimensional data space as mixtures of local linear manifolds under a self-organizing framework. Compared with similar networks, the local linear manifolds learned by our network represent local data distributions in a more localized manner, thanks to a new distortion measure that removes the confusion between sub-models present in many comparable mixture models. Each neuron in the network asymptotically learns a mean vector and a principal subspace of the data in its local region. We prove that the learning of each sub-model is free of local extrema. Experiments show that the new mixture model adapts better to nonlinear manifolds of various data distributions than other similar models. Its online-learning property is desirable when the data set is very large, when computational efficiency is paramount, or when data arrive sequentially. We further demonstrate an application of the model to the recognition of handwritten digit images based on mixtures of local linear manifolds.
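As a rough illustration of the learning scheme summarized above, the following minimal sketch (not the paper's exact algorithm) implements an online mixture of local linear manifolds in which each unit maintains a mean vector and an orthonormal subspace basis. The specific distortion measure shown here (reconstruction residual plus a weighted distance-to-mean term that keeps units localized), the learning rate, and all parameter names are assumptions made for illustration; neighborhood cooperation of the self-organizing map and learning-rate schedules are omitted.

```python
import numpy as np


class LocalLinearManifoldSOM:
    """Illustrative online mixture of local linear manifolds.

    Each unit keeps a mean vector and an orthonormal basis spanning a local
    principal subspace; both are updated by stochastic approximation on the
    winning unit. This is a sketch, not the paper's exact algorithm.
    """

    def __init__(self, n_units, dim, subspace_dim, lr=0.05, mean_weight=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.means = rng.normal(size=(n_units, dim))
        # Random orthonormal bases, shape (n_units, dim, subspace_dim).
        self.bases = np.stack(
            [np.linalg.qr(rng.normal(size=(dim, subspace_dim)))[0] for _ in range(n_units)]
        )
        self.lr = lr
        self.mean_weight = mean_weight  # weight on the locality term (assumed form)

    def distortion(self, x):
        """Distortion of x w.r.t. every unit: squared residual of the projection
        onto the local affine subspace, plus a weighted distance-to-mean term."""
        diffs = x - self.means                               # (n_units, dim)
        coords = np.einsum('ud,udk->uk', diffs, self.bases)  # subspace coordinates
        recon = np.einsum('uk,udk->ud', coords, self.bases)  # projections onto subspaces
        residual = np.sum((diffs - recon) ** 2, axis=1)
        locality = self.mean_weight * np.sum(diffs ** 2, axis=1)
        return residual + locality

    def partial_fit(self, x):
        """One online update: pick the winning unit and adapt its mean and basis."""
        j = int(np.argmin(self.distortion(x)))
        diff = x - self.means[j]
        # Stochastic-approximation update of the local mean.
        self.means[j] += self.lr * diff
        # Oja-style subspace update followed by re-orthonormalization,
        # so the basis tracks the local principal subspace.
        y = self.bases[j].T @ diff
        self.bases[j] += self.lr * np.outer(diff - self.bases[j] @ y, y)
        self.bases[j], _ = np.linalg.qr(self.bases[j])
        return j
```

Calling `partial_fit(x)` on each incoming sample selects the winner under the distortion measure and nudges its mean and subspace, which mirrors the online, sequential-input setting emphasized in the abstract.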
