Modeling, Clustering, and Segmenting Video with Mixtures of Dynamic Textures

A dynamic texture is a spatio-temporal generative model for video, which represents video sequences as observations from a linear dynamical system. This work studies the mixture of dynamic textures, a statistical model for an ensemble of video sequences that is sampled from a finite collection of visual processes, each of which is a dynamic texture. An expectation-maximization (EM) algorithm is derived for learning the parameters of the model, and the model is related to previous work in linear systems, machine learning, time-series clustering, control theory, and computer vision. Through experimentation, it is shown that the mixture of dynamic textures is a suitable representation for both the appearance and dynamics of a variety of visual processes that have traditionally been challenging for computer vision (for example, fire, steam, water, and vehicle and pedestrian traffic). When compared with state-of-the-art methods in motion segmentation, including both temporal texture methods and traditional representations (for example, optical flow or other localized motion representations), the mixture of dynamic textures achieves superior performance in the problems of clustering and segmenting video of such processes.
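The generative process described above can be illustrated with a minimal sketch. It assumes the standard linear dynamical system form, a hidden state x_{t+1} = A x_t + v_t and an observed frame y_t = C x_t + w_t, and simplifies it with isotropic Gaussian noise and a fixed initial state. The function names (`matvec`, `sample_dynamic_texture`, `sample_mixture`) and all parameter values are hypothetical choices for illustration and are not taken from the paper.

```python
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def sample_dynamic_texture(A, C, q_std, r_std, x0, tau, rng):
    """Draw tau frames from x_{t+1} = A x_t + v_t, y_t = C x_t + w_t.

    Assumption: v_t and w_t are isotropic Gaussian noise with standard
    deviations q_std and r_std, and the initial state x0 is fixed.
    """
    x = list(x0)
    frames = []
    for _ in range(tau):
        # Observation: project the hidden state to pixels and add noise.
        frames.append([c + rng.gauss(0.0, r_std) for c in matvec(C, x)])
        # Dynamics: advance the hidden state and add driving noise.
        x = [a + rng.gauss(0.0, q_std) for a in matvec(A, x)]
    return frames


def sample_mixture(components, priors, num_videos, tau, seed=0):
    """Sample videos from a mixture: pick a component, then run its system."""
    rng = random.Random(seed)
    labels, videos = [], []
    for _ in range(num_videos):
        # Hidden assignment variable: which dynamic texture generates the video.
        z = rng.choices(range(len(components)), weights=priors)[0]
        labels.append(z)
        videos.append(sample_dynamic_texture(*components[z], tau, rng))
    return labels, videos


# Two hypothetical components sharing an observation matrix (3 "pixels",
# 2 hidden dimensions) but differing in their dynamics.
C = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decaying = ([[0.99, 0.0], [0.0, 0.95]], C, 0.05, 0.01, [1.0, 0.0])
rotating = ([[0.0, -0.9], [0.9, 0.0]], C, 0.05, 0.01, [1.0, 0.0])

labels, videos = sample_mixture([decaying, rotating], [0.5, 0.5], num_videos=4, tau=10)
```

Each sampled video is a list of `tau` frames, and `labels` records which component generated it. Clustering with the mixture model amounts to inferring these labels from the videos alone, which is the role of the EM algorithm mentioned in the abstract; the sketch covers only the sampling direction.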
