Video summarisation: A conceptual framework and survey of the state of the art

Video summaries provide condensed, succinct representations of the content of a video stream through a combination of still images, video segments, graphical representations and textual descriptors. This paper presents a conceptual framework for video summarisation that is derived from the research literature and then used to survey it. The framework distinguishes between video summarisation techniques (the methods used to process content from a source video stream to achieve a summarisation of that stream) and video summaries (the outputs of video summarisation techniques). Video summarisation techniques are considered within three broad categories: internal (analysing information sourced directly from the video stream), external (analysing information not sourced directly from the video stream) and hybrid (analysing a combination of internal and external information). Video summaries are considered as a function of the type of content they are derived from (object-, event-, perception- or feature-based) and the functionality offered to the user for their consumption (interactive or static, personalised or generic). It is argued that video summarisation would benefit from greater incorporation of external information, particularly user-based information that is unobtrusively sourced, in order to overcome longstanding challenges such as bridging the semantic gap and providing video summaries that have greater relevance to individual users.
