Identifying Surprising Events in Video Using Bayesian Topic Models

Automatic processing of video data is essential in order to allow efficient access to large amounts of video content, a crucial point in such applications as video mining and surveillance. In this paper we focus on the problem of identifying interesting parts of the video. Specifically, we seek to identify atypical video events, which are the events a human user is usually looking for. To this end we employ the notion of Bayesian surprise, as defined in [1, 2], in which an event is considered surprising if its occurrence leads to a large change in the probability of the world model. We propose to compute this abstract measure of surprise by first modeling a corpus of video events using the Latent Dirichlet Allocation model. Subsequently, we measure the change in the Dirichlet prior of the LDA model as a result of each video event's occurrence. This change of the Dirichlet prior leads to a closed form expression for an event's level of surprise, which can then be inferred directly from the observed data. We tested our algorithm on a real dataset of video data, taken by a camera observing an urban street intersection. The results demonstrate our ability to detect atypical events, such as a car making a U-turn or a person crossing an intersection diagonally.

[1]  Yael Pritch,et al.  This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2008 1 Non-Chronological Video , 2022 .

[2]  Pierre Baldi,et al.  A principled approach to detecting surprising events in video , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[3]  Jürgen Schmidhuber,et al.  Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes , 2008, ABiALS.

[4]  Thomas L. Griffiths,et al.  The nested chinese restaurant process and bayesian nonparametric inference of topic hierarchies , 2007, JACM.

[5]  Pietro Perona,et al.  A Bayesian hierarchical model for learning natural scene categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[6]  Yael Pritch,et al.  Clustered Synopsis of Surveillance Video , 2009, 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance.

[7]  Frank Dellaert,et al.  Bayesian surprise and landmark detection , 2009, 2009 IEEE International Conference on Robotics and Automation.

[8]  Antti Oulasvirta,et al.  Computer Vision – ECCV 2006 , 2006, Lecture Notes in Computer Science.

[9]  Frank Dellaert,et al.  Automatic Landmark Detection for Topological Mapping Using Bayesian Surprise , 2008 .

[10]  Ronald G. Hughes,et al.  EVALUATION OF AUTOMATED PEDESTRIAN DETECTION AT SIGNALIZED INTERSECTIONS , 2000 .

[11]  W. Eric L. Grimson,et al.  Unsupervised Activity Perception by Hierarchical Bayesian Models , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Michal Irani,et al.  Detecting Irregularities in Images and in Video , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[13]  Shaogang Gong,et al.  A Markov Clustering Topic Model for mining behaviour in video , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[14]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[15]  David G. Lowe,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.

[16]  Alexei A. Efros,et al.  Discovering objects and their location in images , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[17]  W. James MacLean Spatial Coherence for Visual Motion Analysis , 2006 .

[18]  Juan Carlos Niebles,et al.  Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words , 2008, International Journal of Computer Vision.

[19]  Jintao Li,et al.  Hierarchical spatio-temporal context modeling for action recognition , 2009, CVPR.

[20]  Michael I. Jordan,et al.  An Introduction to Variational Methods for Graphical Models , 1999, Machine Learning.

[21]  Thomas Hofmann,et al.  Probabilistic Latent Semantic Analysis , 1999, UAI.

[22]  Martin V. Butz,et al.  Anticipatory Behavior in Adaptive Learning Systems , 2003, Lecture Notes in Computer Science.

[23]  Harry Shum,et al.  Background Cut , 2006, ECCV.

[24]  Ivan Laptev,et al.  Local Descriptors for Spatio-temporal Recognition , 2004, SCVMA.