Adaptive Video Fast Forward

We derive a statistical graphical model of video scenes with multiple, possibly occluded objects that can be efficiently used for tasks related to video search, browsing and retrieval. The model is trained on query (target) clip selected by the user. Shot retrieval process is based on the likelihood of a video frame under generative model. Instead of using a combination of weighted Euclidean distances as a shot similarity measure, the likelihood model automatically separates and balances various causes of variability in video, including occlusion, appearance change and motion. Thus, we overcome tedious and complex user interventions required in previous studies. We use the model in the adaptive video forward application that adapts video playback speed to the likelihood of the data. The similarity measure of each candidate clip to the target clip defines the playback speed. Given a query, the video is played at a higher speed as long as video content has low likelihood, and when frames similar to the query clip start to come in, the video playback rate drops. Set of experiments o12n typical home videos demonstrate performance, easiness and utility of our application.

[1]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[2]  Michal Irani,et al.  Video indexing based on mosaic representations , 1998, Proc. IEEE.

[3]  Alex Pentland,et al.  Pfinder: real-time tracking of the human body , 1996, Proceedings of the Second International Conference on Automatic Face and Gesture Recognition.

[4]  Shree K. Nayar,et al.  Spatial information in multiresolution histograms , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[5]  M. Irani,et al.  Event-Based Video Analysis, , 2001 .

[6]  Oded Maron,et al.  Multiple-Instance Learning for Natural Scene Classification , 1998, ICML.

[7]  Brendan J. Frey,et al.  Learning flexible sprites in video layers , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[8]  Cordelia Schmid,et al.  Constructing models for content-based image retrieval , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[9]  N. Jojic,et al.  Scene generative models for adaptive video fast forward , 2003, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429).

[10]  Shih-Fu Chang,et al.  A fully automated content-based video search engine supporting spatiotemporal queries , 1998, IEEE Trans. Circuits Syst. Video Technol..

[11]  Walter R. Gilks,et al.  BUGS - Bayesian inference Using Gibbs Sampling Version 0.50 , 1995 .

[12]  Chong-Wah Ngo,et al.  On clustering and retrieval of video shots , 2001, MULTIMEDIA '01.

[13]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[14]  Daniel Kahneman,et al.  Probabilistic reasoning , 1993 .

[15]  Brendan J. Frey,et al.  Probabilistic multimedia objects (multijects): a novel approach to video indexing and retrieval in multimedia systems , 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).

[16]  Brendan J. Frey,et al.  Transformed hidden Markov models: estimating mixture models of images and inferring spatial transformations in video sequences , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[17]  Paul A. Viola,et al.  Structure Driven Image Database Retrieval , 1997, NIPS.

[18]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems , 1988 .

[19]  Gopal Sarma Pingali,et al.  Multimedia retrieval through spatio-temporal activity maps , 2001, MULTIMEDIA '01.

[20]  Michael I. Jordan,et al.  An Introduction to Variational Methods for Graphical Models , 1999, Machine-mediated learning.

[21]  Paul A. Viola,et al.  Boosting Image Retrieval , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[22]  Michael J. Swain,et al.  Color indexing , 1991, International Journal of Computer Vision.

[23]  Chris Stauffer,et al.  Transform-invariant Image Decomposition with Similarity Templates , 2001, NIPS.

[24]  Lihi Zelnik-Manor,et al.  Event-based analysis of video , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.