An Improved Hierarchical Dirichlet Process-Hidden Markov Model and Its Application to Trajectory Modeling and Retrieval

In this paper, we propose a hierarchical Bayesian model for visual document analysis: an improved hierarchical Dirichlet process-hidden Markov model (iHDP-HMM). The iHDP-HMM clusters visual documents and captures the temporal correlations between the visual words within each document, while adaptively identifying both the number of document clusters and the number of visual topics. A Bayesian inference mechanism for the iHDP-HMM is developed to carry out likelihood evaluation, topic estimation, and cluster membership prediction. We apply the iHDP-HMM to simultaneously cluster motion trajectories and discover latent topics for trajectory words, using a proposed method for constructing the trajectory word codebook. An iHDP-HMM-based probabilistic trajectory retrieval framework is then developed. The experimental results verify the clustering accuracy of the iHDP-HMM and the trajectory retrieval accuracy of the proposed framework.
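To illustrate what identifying the number of clusters adaptively can mean in a Dirichlet process setting, the following minimal sketch samples cluster labels from a Chinese restaurant process, a standard representation of the Dirichlet process. It is not the iHDP-HMM or its inference mechanism; the function name `crp_assignments` and the parameters `alpha` and `seed` are assumptions introduced only for this illustration.

```python
import random
from typing import List


def crp_assignments(n_items: int, alpha: float, seed: int = 0) -> List[int]:
    """Sample cluster labels for n_items from a Chinese restaurant process.

    alpha > 0 is the concentration parameter (hypothetical name); larger
    values tend to open more clusters.
    """
    rng = random.Random(seed)
    counts: List[int] = []  # counts[k] = number of items already in cluster k
    labels: List[int] = []
    for _ in range(n_items):
        # An existing cluster k is chosen with weight counts[k];
        # a new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels


if __name__ == "__main__":
    labels = crp_assignments(n_items=200, alpha=1.0)
    print("clusters used:", len(set(labels)))
```

The number of clusters is not fixed in advance: it grows with the data at a rate governed by `alpha`, which is the general property that nonparametric models of this family rely on.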
