Motion clustering for similar video segments mining

To discover similar video segments in a surveillance video sequence, a new approach is proposed for clustering the motion data of moving objects. A simple background subtraction algorithm is used to obtain the binary mask of moving objects, which segments the video sequence captured by a fixed camera. A mixture of hidden Markov models (HMMs) is then fitted, using the expectation-maximization (EM) scheme, to the motion data extracted from the binary mask. Unlike previous work based on k-means, where every observed data set is assigned to a single HMM, the proposed approach allows every video segment to belong to more than one HMM with some probability. Experiments with real data demonstrate the benefit of this soft assignment when there is more "overlap" in the processes generating the data. The experimental results also indicate the promising potential of HMM-based motion clustering for mining similar video segments from surveillance video.
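The difference between the two assignment schemes can be illustrated with a minimal sketch. It assumes that the log-likelihood of one video segment under each HMM has already been computed (for example with the forward algorithm). The function names and the numeric values are hypothetical and serve only as illustration; they are not taken from the experiments.

```python
import math

def soft_assignments(log_liks, log_priors):
    """Posterior probability of each HMM component for one video segment.

    log_liks[k]   : log p(segment | HMM k), assumed to be precomputed
    log_priors[k] : log mixing weight of HMM k
    """
    joint = [ll + lp for ll, lp in zip(log_liks, log_priors)]
    m = max(joint)  # subtract the maximum for numerical stability
    weights = [math.exp(j - m) for j in joint]
    total = sum(weights)
    return [w / total for w in weights]

def hard_assignment(log_liks, log_priors):
    """k-means-style alternative: index of the single most probable HMM."""
    joint = [ll + lp for ll, lp in zip(log_liks, log_priors)]
    return max(range(len(joint)), key=joint.__getitem__)

# Hypothetical log-likelihoods of one segment under three HMMs
log_liks = [-120.0, -120.5, -140.0]
log_priors = [math.log(1 / 3)] * 3  # equal mixing weights (assumption)

resp = soft_assignments(log_liks, log_priors)
label = hard_assignment(log_liks, log_priors)
```

In this sketch the first two HMMs explain the segment almost equally well, so the soft scheme splits the membership between them, whereas the hard scheme discards the second HMM entirely. This is the kind of "overlap" in which soft assignment is expected to help.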
