Factor Analysis Using Batch and Online EM

A simple way to encode input patterns is to suppose that each input can be well approximated by a linear combination of component vectors, where the amplitudes of the vectors are modulated to match the input. For a given training set, the most appropriate set of component vectors depends on how we expect the modulation levels to behave and on how we measure the distance between the input and its approximation. These effects can be captured by a generative probability model that specifies a distribution $p(z)$ over modulation levels $z = (z_1, \ldots, z_K)$ and a distribution $p(x \mid z)$ over sensors $x = (x_1, \ldots, x_N)$ given the modulation levels. The linear combination constraint is given by

$$E[x \mid z] = \Lambda z, \tag{1}$$

where $\Lambda$ is the $N \times K$ matrix whose columns are the component vectors.
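To make the generative model concrete, the following is a minimal NumPy sketch of the standard factor analysis instance of this model: a unit-variance Gaussian prior over the modulation levels $z$ and a conditional Gaussian $p(x \mid z)$ whose mean is $\Lambda z$, as in Eq. (1). The function name, the `noise_var` parameter, and the specific Gaussian choices are illustrative assumptions, not details taken from this paper.

```python
import numpy as np

def sample_factor_analysis(Lambda, noise_var, num_samples, rng=None):
    """Draw samples from a factor analysis generative model.

    Assumed standard form (not specified in the text above):
    z ~ N(0, I_K) and x | z ~ N(Lambda z, diag(noise_var)),
    so that E[x | z] = Lambda z as in Eq. (1).
    """
    rng = np.random.default_rng(rng)
    N, K = Lambda.shape
    # Modulation levels (latent factors), one K-vector per sample.
    z = rng.standard_normal((num_samples, K))
    # Linear combination of the component vectors (columns of Lambda),
    # plus independent sensor noise with per-sensor variances.
    noise = rng.standard_normal((num_samples, N)) * np.sqrt(noise_var)
    x = z @ Lambda.T + noise
    return x, z

# Example: K = 3 component vectors in an N = 5 dimensional sensor space.
Lambda = np.random.default_rng(0).standard_normal((5, 3))
x, z = sample_factor_analysis(Lambda, noise_var=0.1 * np.ones(5), num_samples=1000)
```

With diagonal sensor noise, all correlations between sensors must be explained by the shared modulation levels, which is what distinguishes factor analysis from simply fitting a full covariance matrix.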