Identifying Surprising Events in Videos Using Bayesian Topic Models

Automatic processing of video data is essential for efficient access to large amounts of video content, a crucial point in applications such as video mining and surveillance. In this paper we focus on the problem of identifying the interesting parts of a video. Specifically, we seek to identify atypical video events, which are the events a human user is usually looking for. To this end we employ the notion of Bayesian surprise, as defined in [1, 2], in which an event is considered surprising if its occurrence leads to a large change in the probability of the world model. We propose to compute this abstract measure of surprise by first modeling a corpus of video events using the Latent Dirichlet Allocation (LDA) model. We then measure the change in the Dirichlet prior of the LDA model that results from each video event’s occurrence. This change in the Dirichlet prior leads to a closed-form expression for an event’s level of surprise, which can then be inferred directly from the observed data. We tested our algorithm on real video data captured by a camera observing an urban street intersection. The results demonstrate our ability to detect atypical events, such as a car making a U-turn or a person crossing an intersection diagonally.
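As a rough illustration of why a change in a Dirichlet prior admits a closed-form surprise score, the sketch below computes the Kullback-Leibler divergence between two Dirichlet distributions, which has a well-known closed form. This is not necessarily the exact expression derived in the paper. The function names `digamma` and `dirichlet_kl`, the direction of the divergence (updated parameters relative to the original ones), and the pseudo-count values are all assumptions made for this example.

```python
import math


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    # Shift x upward until the asymptotic expansion is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 / x
    result -= inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def dirichlet_kl(alpha_new, alpha_old):
    """Closed-form KL( Dir(alpha_new) || Dir(alpha_old) ).

    Assumption for this sketch: alpha_old holds the Dirichlet parameters
    before an event is observed and alpha_new the parameters after it.
    """
    if len(alpha_new) != len(alpha_old):
        raise ValueError("parameter vectors must have the same length")
    sum_new = sum(alpha_new)
    sum_old = sum(alpha_old)
    # Difference of the log normalizing constants.
    kl = math.lgamma(sum_new) - sum(math.lgamma(a) for a in alpha_new)
    kl -= math.lgamma(sum_old) - sum(math.lgamma(b) for b in alpha_old)
    # Expected log-ratio term under Dir(alpha_new).
    psi_sum = digamma(sum_new)
    for a, b in zip(alpha_new, alpha_old):
        kl += (a - b) * (digamma(a) - psi_sum)
    return kl


# Illustrative pseudo-counts over three topics (made-up values).
prior = [1.0, 1.0, 1.0]
mild_update = [2.0, 1.0, 1.0]    # event shifts the prior slightly
strong_update = [5.0, 1.0, 1.0]  # event shifts the prior strongly

mild_surprise = dirichlet_kl(mild_update, prior)
strong_surprise = dirichlet_kl(strong_update, prior)
```

Under these assumptions, an event that moves the Dirichlet parameters further from their previous values receives a larger score, so `strong_surprise` exceeds `mild_surprise`, and an event that leaves the parameters unchanged scores zero.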

[1]  Yael Pritch,et al.  Non-Chronological Video , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Jürgen Schmidhuber,et al.  Driven by Compression Progress: A Simple Principle Explains Essential Aspects of Subjective Beauty, Novelty, Surprise, Interestingness, Attention, Curiosity, Creativity, Art, Science, Music, Jokes , 2008, ABiALS.

[3]  Michael I. Jordan,et al.  An Introduction to Variational Methods for Graphical Models , 1999, Machine-mediated learning.

[4]  Pietro Perona,et al.  A Bayesian hierarchical model for learning natural scene categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[5]  Yael Pritch,et al.  Clustered Synopsis of Surveillance Video , 2009, 2009 Sixth IEEE International Conference on Advanced Video and Signal Based Surveillance.

[6]  Frank Dellaert,et al.  Bayesian surprise and landmark detection , 2009, 2009 IEEE International Conference on Robotics and Automation.

[7]  Pierre Baldi,et al.  A principled approach to detecting surprising events in video , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[8]  Alexei A. Efros,et al.  Discovering objects and their location in images , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[9]  Ivan Laptev,et al.  Local Descriptors for Spatio-temporal Recognition , 2004, SCVMA.

[10]  Shaogang Gong,et al.  A Markov Clustering Topic Model for mining behaviour in video , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[11]  Jintao Li,et al.  Hierarchical spatio-temporal context modeling for action recognition , 2009, CVPR.

[12]  Juan Carlos Niebles,et al.  Unsupervised Learning of Human Action Categories , 2006 .

[13]  Michal Irani,et al.  Detecting Irregularities in Images and in Video , 2005, ICCV.

[14]  Matthijs C. Dorst.  Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[15]  Harry Shum,et al.  Background Cut , 2006, ECCV.

[16]  Ronald G. Hughes,et al.  Evaluation of Automated Pedestrian Detection at Signalized Intersections , 2000 .

[17]  W. Eric L. Grimson,et al.  Unsupervised Activity Perception by Hierarchical Bayesian Models , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Thomas Hofmann,et al.  Probabilistic Latent Semantic Analysis , 1999, UAI.