Occlusive Components Analysis

We study unsupervised learning in a probabilistic generative model for occlusion. The model uses two types of latent variables: one indicates which objects are present in the image, and the other indicates how they are ordered in depth. This depth order then determines how the positions and appearances of the present objects, which are specified in the model parameters, combine to form the image. We show that the object parameters can be learnt from an unlabelled set of images in which objects occlude one another. Exact maximum-likelihood learning is intractable, but we show that tractable approximations to Expectation Maximization (EM) exist if each training image contains only a small number of objects on average. Numerical experiments show that these approximations recover the correct set of object parameters. Experiments on a novel version of the bars test using colored bars, and on more realistic data, show that the algorithm extracts the generating causes well. On the standard bars benchmark for object learning, the algorithm compares favourably with other recent component extraction approaches. The model and the learning algorithm thus connect research on occlusion with the field of multiple-causes component extraction.
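To make the generative process concrete, the following is a minimal sketch in Python. Everything here is illustrative rather than the paper's implementation: the Bernoulli presence prior `pi`, the binary masks and scalar grey values standing in for object positions and appearances, the zero background, and the uniform distribution over depth orders are all assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

H, D = 4, 25   # number of objects, number of pixels (a 5x5 image)
pi = 0.2       # assumed prior probability that each object is present

# Placeholder object parameters: a binary mask (where the object sits)
# and a scalar grey value (its appearance) per object.  These are the
# parameters the paper learns; here they are random stand-ins.
masks = rng.random((H, D)) > 0.7
appearances = rng.random(H)

def generate_image():
    """Sample one image from the occlusion generative model (sketch)."""
    present = rng.random(H) < pi      # latent: which objects are present
    depth_order = rng.permutation(H)  # latent: depth order, back to front
    image = np.zeros(D)               # background, assumed zero here
    # Paint present objects from back to front; an object painted later
    # is nearer and occludes whatever lies behind it.
    for h in depth_order:
        if present[h]:
            image[masks[h]] = appearances[h]
    return image, present, depth_order

img, present, order = generate_image()
print(img.reshape(5, 5))
```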
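The tractability argument can likewise be illustrated. If each image contains at most a few objects, an approximate E-step only needs to sum over latent states in which a small number of objects are present, together with their depth orderings, rather than over all 2^H subsets. The function below enumerates such a truncated state space; the cutoff `gamma` and the state representation are hypothetical and stand in for, rather than reproduce, the paper's exact approximation scheme.

```python
from itertools import combinations, permutations

def truncated_states(H, gamma):
    """Enumerate latent states with at most `gamma` objects present.

    A state is a tuple of the present objects listed in depth order.
    For fixed `gamma` the number of states grows only polynomially in
    H, which keeps an approximate E-step sum tractable.
    """
    states = [()]  # the empty, background-only state
    for k in range(1, gamma + 1):
        for subset in combinations(range(H), k):
            states.extend(permutations(subset))
    return states

# For H = 10 objects and gamma = 2: 1 empty state, 10 singletons, and
# 45 * 2 = 90 ordered pairs, i.e. 101 states instead of all 2^10
# subsets with all their orderings.
print(len(truncated_states(10, 2)))  # -> 101
```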
