Image Segmentation in Video Sequences: A Probabilistic Approach

"Background subtraction" is an old technique for finding moving objects in a video sequence, for example, cars driving on a freeway. The idea is that subtracting the current image from a time-averaged background image will leave only nonstationary objects. It is, however, a crude approximation to the task of classifying each pixel of the current image: it fails with slow-moving objects and does not distinguish shadows from moving objects. The basic idea of this paper is that we can classify each pixel using a model of how that pixel looks when it is part of different classes. We learn a mixture-of-Gaussians classification model for each pixel using an unsupervised technique, an efficient, incremental version of EM. Unlike the standard image-averaging approach, this automatically updates the mixture component for each class according to the likelihood of membership; hence slow-moving objects are handled perfectly. Our approach also identifies and eliminates shadows much more effectively than other techniques such as thresholding. Application of this method as part of the Roadwatch traffic surveillance project is expected to result in significant improvements in vehicle identification and tracking.
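
To make the per-pixel idea concrete, the following is a minimal sketch, not the paper's actual algorithm. It assumes a single grayscale intensity per pixel, a fixed number of one-dimensional Gaussian components, and a simple incremental EM variant that accumulates responsibility-weighted sufficient statistics one frame at a time. The class name `PixelMixture`, its parameters, the variance floor, and the starting means are all hypothetical choices made for illustration.

```python
import math


class PixelMixture:
    """Hypothetical per-pixel mixture of 1-D Gaussians, updated one frame at a time."""

    def __init__(self, means, variance=100.0, min_variance=4.0):
        # Assumption: one component per class, seeded with a rough intensity guess.
        self.min_variance = min_variance
        # Sufficient statistics, seeded with one pseudo-observation per component.
        self.n = [1.0] * len(means)                            # sum of responsibilities
        self.s = [float(m) for m in means]                     # sum of r * x
        self.q = [variance + float(m) ** 2 for m in means]     # sum of r * x^2

    def params(self):
        """Return (weight, mean, variance) per component from the statistics."""
        total = sum(self.n)
        out = []
        for n, s, q in zip(self.n, self.s, self.q):
            mean = s / n
            var = max(q / n - mean * mean, self.min_variance)  # floor avoids collapse
            out.append((n / total, mean, var))
        return out

    def responsibilities(self, x):
        """E-step for one observation: posterior probability of each component."""
        logs = [
            math.log(w) - 0.5 * math.log(2.0 * math.pi * v) - 0.5 * (x - m) ** 2 / v
            for w, m, v in self.params()
        ]
        top = max(logs)                                        # subtract max for stability
        exps = [math.exp(l - top) for l in logs]
        total = sum(exps)
        return [e / total for e in exps]

    def update(self, x):
        """Incremental step: fold the new observation into the statistics."""
        r = self.responsibilities(x)
        for k, rk in enumerate(r):
            self.n[k] += rk
            self.s[k] += rk * x
            self.q[k] += rk * x * x
        return r

    def classify(self, x):
        """Index of the most probable component for intensity x."""
        r = self.responsibilities(x)
        return max(range(len(r)), key=lambda k: r[k])
```

In use, one such model would be kept for every pixel, `update` would be called with that pixel's intensity in each new frame, and `classify` would give the segmentation label. Because each observation only shifts the components in proportion to its membership probability, a vehicle that lingers over a pixel mostly updates its own component rather than being absorbed into the background, which is the contrast with plain image averaging drawn above.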
