Adaptive Keyframe Selection for Video Summarization

The explosive growth of video data has motivated research in video summarization, which attempts to extract the salient frames of a video in order to provide an easily interpreted synopsis. Existing work on video summarization has primarily been static; that is, the algorithms require the summary length to be specified as an input parameter. However, video streams are inherently dynamic in nature: while some are relatively simple in terms of visual content, others are much more complex due to camera/object motion, changing illumination, cluttered scenes and low quality. This necessitates the development of adaptive summarization techniques, which adapt to the complexity of a video and generate a summary accordingly. In this paper, we propose a novel algorithm to address this problem. We pose summary selection as an optimization problem and derive an efficient technique that solves for both the summary length and the specific frames to be selected, through a single formulation. Our extensive empirical studies on a wide range of challenging, unconstrained videos demonstrate the promise of this method for real-world video summarization applications.
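To make the idea of a single formulation concrete, the sketch below illustrates one way a selection objective can determine the summary length and the selected frames together. It is not the formulation proposed in this paper: the coverage-minus-cost objective, the per-frame `cost` parameter, the pairwise similarity matrix `sim`, and the double-greedy pass (in the style of unconstrained submodular maximization) are all assumptions chosen for illustration.

```python
# Illustrative sketch only. The objective, the `cost` parameter and the
# double-greedy pass are assumptions, not the method of this paper.

def objective(sim, selected, cost):
    """Coverage of every frame by its most similar selected frame,
    minus a fixed cost for each selected frame."""
    if not selected:
        return 0.0
    coverage = sum(max(row[j] for j in selected) for row in sim)
    return coverage - cost * len(selected)


def double_greedy(sim, cost):
    """Single pass over the frames that decides, frame by frame, whether
    to keep it. No summary length is passed in; the length is simply the
    number of frames that survive the pass."""
    n = len(sim)
    lower, upper = set(), set(range(n))
    for i in range(n):
        # Gain from adding frame i to the growing set.
        gain_add = objective(sim, lower | {i}, cost) - objective(sim, lower, cost)
        # Gain from dropping frame i from the shrinking set.
        gain_drop = objective(sim, upper - {i}, cost) - objective(sim, upper, cost)
        if gain_add >= gain_drop:
            lower.add(i)
        else:
            upper.discard(i)
    return sorted(lower)


# Toy video: six frames forming two visually identical groups.
labels = [0, 0, 0, 1, 1, 1]
sim = [[1.0 if a == b else 0.0 for b in labels] for a in labels]
summary = double_greedy(sim, cost=0.5)
```

On this toy input the pass keeps one frame per group, so a video with more distinct groups yields a longer summary without any change to the parameters. In practice `sim` would come from frame descriptors, and the value of `cost` controls how readily frames are added.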
