Statistical Skimming of Feature Films

We present a statistical framework based on Hidden Markov Models (HMMs) for skimming feature films. A chain of HMMs is used to model subsequent story units: HMM states represent different visual-concepts, transitions model the temporal dependencies in each story unit, and stochastic observations are given by single shots. The skim is generated as an observation sequence, where, in order to privilege more informative segments for entering the skim, shots are assigned higher probability of observation if endowed with salient features related to specific film genres. The effectiveness of the method is demonstrated by skimming the first thirty minutes of a wide set of action and dramatic movies, in order to create previews for users useful for assessing whether they would like to see that movie or not, but without revealing the movie central part and plot details. Results are evaluated and compared through extensive user tests in terms of metrics that estimate the content representational value of the obtained video skims and their utility for assessing the user's interest in the observed movie.

[1]  Shih-Fu Chang,et al.  Structure analysis of soccer video with hidden Markov models , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[2]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[3]  Wei-Lun Chang,et al.  Aesthetics-Based Automatic Home Video Skimming System , 2008, MMM.

[4]  Michael A. Smith,et al.  Video skimming and characterization through the combination of image and language understanding techniques , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[5]  Alan Hanjalic,et al.  An integrated scheme for automated video abstraction based on unsupervised cluster-validity analysis , 1999, IEEE Trans. Circuits Syst. Video Technol..

[6]  Durbin,et al.  Biological Sequence Analysis , 1998 .

[7]  Pierangelo Migliorati,et al.  Hierarchical structuring of video previews by Leading-Cluster-Analysis , 2010, Signal Image Video Process..

[8]  Noboru Babaguchi,et al.  Video Summarization for Large Sports Video Archives , 2005, 2005 IEEE International Conference on Multimedia and Expo.

[9]  Jeho Nam,et al.  Video abstract of video , 1999, 1999 IEEE Third Workshop on Multimedia Signal Processing (Cat. No.99TH8451).

[10]  Jianping Fan,et al.  Exploring video content structure for hierarchical summarization , 2004, Multimedia Systems.

[11]  Xin Liu,et al.  Video summarization using singular value decomposition , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[12]  Dragutin Petkovic,et al.  Using audio time scale modification for video browsing , 2000, Proceedings of the 33rd Annual Hawaii International Conference on System Sciences.

[13]  Jenny Benois-Pineau,et al.  Clustering of scene repeats for essential rushes preview , 2009, 2009 10th Workshop on Image Analysis for Multimedia Interactive Services.

[14]  Qian Huang,et al.  Automated generation of news content hierarchy by integrating audio, video, and text information , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[15]  Shih-Fu Chang,et al.  A utility framework for the automatic generation of audio-visual skims , 2002, MULTIMEDIA '02.

[16]  Boon-Lock Yeo,et al.  Time-constrained clustering for segmentation of video into story units , 1996, Proceedings of 13th International Conference on Pattern Recognition.

[17]  Anoop Gupta,et al.  Time-compression: systems concerns, usage, and benefits , 1999, CHI '99.

[18]  Paul A. Viola,et al.  Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[19]  Zhu Liu,et al.  Multimedia content analysis-using both audio and visual clues , 2000, IEEE Signal Process. Mag..

[20]  Ba Tu Truong,et al.  Video abstraction: A systematic review and classification , 2007, TOMCCAP.

[21]  Alan Hanjalic,et al.  Automated high-level movie segmentation for advanced video-retrieval systems , 1999, IEEE Trans. Circuits Syst. Video Technol..

[22]  Paul Over,et al.  Evaluation campaigns and TRECVid , 2006, MIR '06.

[23]  HongJiang Zhang,et al.  A model of motion attention for video skimming , 2002, Proceedings. International Conference on Image Processing.

[24]  Yue Gao,et al.  Dynamic video summarization using two-level redundancy detection , 2009, Multimedia Tools and Applications.

[25]  Petros Maragos,et al.  Video event detection and summarization using audio, visual and text saliency , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[26]  Ajay Divakaran,et al.  MPEG-7 visual motion descriptors , 2001, IEEE Trans. Circuits Syst. Video Technol..

[27]  Aggelos K. Katsaggelos,et al.  MINMAX optimal video summarization , 2005, IEEE Transactions on Circuits and Systems for Video Technology.

[28]  Sean R. Eddy,et al.  Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids , 1998 .

[29]  Hans Weda,et al.  Automated summarization of narrative video on a semantic level , 2007 .

[30]  Frederic P. Miller,et al.  Internet Movie Database , 2009 .

[31]  Riccardo Leonardi,et al.  Video Shot Clustering and Summarization through dendrograms , 2006 .

[32]  Chong-Wah Ngo,et al.  Video summarization and scene detection by graph modeling , 2005, IEEE Transactions on Circuits and Systems for Video Technology.