Learning object segmentation from video data

This memo describes the initial results of a project to create a self-supervised algorithm for learning object segmentation from video data. Developmental psychology and computational experience have demonstrated that the motion segmentation of objects is a simpler, more primitive process than the detection of object boundaries by static image cues. Motion information therefore provides a plausible supervision signal for learning the static boundary detection task and for evaluating performance on a test set. A video camera and previously developed background subtraction algorithms can automatically produce a large database of motion-segmented images at minimal cost. The purpose of this work is to use the information in such a database to learn how to detect object boundaries in novel images using static information such as color, texture, and shape.

This work was funded in part by the Office of Naval Research contract #N0001400-1-0298, in part by the Singapore-MIT Alliance agreement of 11/6/98, and in part by a National Science Foundation Graduate Student Fellowship.
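As a rough illustration of how motion can supply segmentation labels automatically, the sketch below marks as foreground any pixel that differs strongly from a running-average background estimate. This is a deliberately simple stand-in and not the background subtraction algorithms the memo relies on; the function name `motion_masks`, the parameters `alpha` and `threshold`, and the representation of a frame as a nested list of grayscale values are all assumptions made for this example.

```python
def motion_masks(frames, alpha=0.05, threshold=30.0):
    """Return one binary foreground mask per frame (hypothetical helper).

    frames: list of 2-D lists of grayscale intensities, all the same size.
    alpha: assumed learning rate of the running-average background.
    threshold: assumed intensity difference that counts as motion.
    """
    background = None
    masks = []
    for frame in frames:
        if background is None:
            # Assume the first frame shows only background.
            background = [[float(p) for p in row] for row in frame]
            masks.append([[0 for _ in row] for row in frame])
            continue
        # A pixel is foreground if it departs from the background estimate.
        mask = [
            [1 if abs(p - b) > threshold else 0 for p, b in zip(row, brow)]
            for row, brow in zip(frame, background)
        ]
        # Update the background only where no motion was detected.
        background = [
            [b if m else (1 - alpha) * b + alpha * p
             for p, b, m in zip(row, brow, mrow)]
            for row, brow, mrow in zip(frame, background, mask)
        ]
        masks.append(mask)
    return masks


# Tiny synthetic clip: a bright "object" pixel moves across a dark scene.
frames = [
    [[10, 10, 10], [10, 10, 10]],
    [[10, 200, 10], [10, 10, 10]],
    [[10, 10, 10], [10, 10, 200]],
]
masks = motion_masks(frames)
```

Each (frame, mask) pair produced this way could serve as one training example: the static frame is the input and the motion-derived mask is the supervision target.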
