A simple, effective way to model images is to represent each input pattern as a linear combination of "component" vectors, where the amplitudes of the vectors are modulated to match the input. This approach includes principal component analysis, independent component analysis, and factor analysis. In practice, however, images are subjected to randomly selected transformations of a known nature, such as translation and rotation. Applying the above methods directly yields severely blurred components that tend to ignore the more interesting and useful structure. In previous work, we introduced a clustering algorithm that is invariant to transformations. In this paper, we propose a method called transformed component analysis, which incorporates a discrete hidden variable that accounts for transformations and uses the expectation-maximization (EM) algorithm to jointly extract components and normalize for transformations. We illustrate the algorithm on a shading problem, facial expression modeling, and handwritten digit recognition.
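The core idea, EM over a discrete hidden transformation variable, can be sketched in a few lines. The following is a minimal, illustrative NumPy example (not the paper's full model): a single component is learned from 1-D signals observed under random circular shifts. The E-step computes a posterior over shifts for each example; the M-step averages the examples after un-shifting them, weighted by that posterior. All names, sizes, and noise settings here are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: one fixed 1-D pattern observed under random circular shifts plus noise.
D = 8
pattern = np.zeros(D)
pattern[2:5] = 1.0
shifts = rng.integers(0, D, size=200)
X = np.stack([np.roll(pattern, s) for s in shifts])
X += 0.1 * rng.standard_normal(X.shape)

# EM for a single component mu with a discrete hidden shift variable.
# (A sketch of the transformation-invariant idea only; variance is held fixed.)
mu = X[0].copy()   # initialize from one example to break the shift symmetry
var = 0.05
for _ in range(20):
    # E-step: log p(x | shift t) under an isotropic Gaussian around the shifted component.
    shifted = np.stack([np.roll(mu, t) for t in range(D)])        # (D, D)
    sq = ((X[:, None, :] - shifted[None, :, :]) ** 2).sum(-1)     # (N, D)
    logp = -0.5 * sq / var
    logp -= logp.max(axis=1, keepdims=True)
    post = np.exp(logp)
    post /= post.sum(axis=1, keepdims=True)                        # posterior over shifts
    # M-step: posterior-weighted average of the un-shifted examples.
    acc = np.zeros(D)
    for t in range(D):
        acc += post[:, t] @ np.roll(X, -t, axis=1)
    mu = acc / len(X)
```

After convergence, `mu` recovers a sharp copy of the underlying pattern (up to a circular shift), whereas the naive mean of the raw data is a blurred, nearly uniform vector, which is exactly the failure mode of transformation-unaware component models described above.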