To read a hand-written digit string, it is helpful to segment the image into separate digits. Bottom-up segmentation heuristics often fail when neighboring digits overlap substantially. We describe a system that has a stochastic generative model of each digit class, and we show that this is the only knowledge required for segmentation. The system uses Gibbs sampling to construct a perceptual interpretation of a digit string, and segmentation arises naturally from the "explaining away" effects that occur during Bayesian inference. By using conditional mixtures of factor analyzers, it is possible to extract an explicit, compact representation of the instantiation parameters that describe the pose of each digit. These instantiation parameters can then be used as the inputs to a higher level system that models the relationships between digits. The same technique could be used to model individual digits as redundancies between the instantiation parameters of their parts.
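
As a rough illustration of how segmentation can fall out of Gibbs sampling over competing generative models, the following toy sketch may help. It is not the system described above: the conditional mixtures of factor analyzers are replaced by two one-dimensional Gaussian "ink blobs", each with a single unknown centre standing in for a pose, and the function name `gibbs_segment` together with all parameters (`sigma`, `prior_sigma`, `n_sweeps`) are assumptions made for this example. The sampler alternates between assigning each ink position to a model and resampling each model's centre from the ink it currently owns, so ink that one model already accounts for stops acting as evidence for the other.

```python
import math
import random


def gibbs_segment(xs, n_sweeps=200, sigma=1.0, prior_sigma=10.0, seed=0):
    """Toy Gibbs sampler that splits 1-D ink positions between two models.

    Hypothetical stand-in for a digit model: model k is a Gaussian blob of
    ink with known width `sigma` and unknown centre mu[k] (its "pose"),
    which has a zero-mean Gaussian prior of width `prior_sigma`.
    """
    rng = random.Random(seed)
    mu = [min(xs), max(xs)]  # crude initial poses: leftmost and rightmost ink
    z = [0] * len(xs)        # z[i] is the model that currently owns xs[i]
    for _ in range(n_sweeps):
        # Step 1: sample the owner of each ink point given the current poses.
        for i, x in enumerate(xs):
            w = [math.exp(-0.5 * ((x - m) / sigma) ** 2) for m in mu]
            total = w[0] + w[1]
            p0 = 0.5 if total == 0.0 else w[0] / total
            z[i] = 0 if rng.random() < p0 else 1
        # Step 2: sample each pose from its Gaussian posterior given the
        # points that model owns (conjugate update with known variance).
        for k in range(2):
            owned = [x for x, zi in zip(xs, z) if zi == k]
            precision = len(owned) / sigma ** 2 + 1.0 / prior_sigma ** 2
            mean = (sum(owned) / sigma ** 2) / precision
            mu[k] = rng.gauss(mean, math.sqrt(1.0 / precision))
    return z, mu


# Usage: six ink positions forming two neighbouring blobs.
ink = [1.5, 2.0, 2.5, 7.5, 8.0, 8.5]
assignments, poses = gibbs_segment(ink)
```

Here `assignments` plays the role of the segmentation and `poses` the role of the instantiation parameters; in this sketch the segmentation is never computed by a separate bottom-up step, it is simply the set of ownership variables visited by the sampler.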