Automatic categorization and summarization of documentaries

In this paper, we propose methods for the automatic categorization and summarization of documentaries using video subtitles. We propose two methods for video categorization. The first performs unsupervised categorization by applying natural language processing techniques to the subtitles, using the WordNet lexical database and WordNet Domains. The second uses the same extraction steps but categorizes with a learning module. Experiments with documentary videos give promising results in discovering the correct categories of videos. We also propose a video summarization method that applies text summarization techniques to the subtitles. These techniques identify the significant sentences in a video's subtitles, and a video summary is then composed from the video segments that correspond to those sentences.
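The summarization idea above (score subtitle sentences, then map the selected sentences back to their time spans) can be illustrated with a minimal sketch. This is not the authors' method: it swaps in a plain word-frequency score for the text summarization techniques used in the paper, and the names `Cue`, `STOPWORDS`, `summarize_cues`, and `demo_cues` are hypothetical, introduced only for illustration. It also assumes each subtitle cue holds one sentence with start and end times in seconds.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Cue:
    """One subtitle cue (hypothetical structure): time span in seconds plus text."""
    start: float
    end: float
    text: str


# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "it", "was", "when", "from", "for", "you",
             "of", "and", "to", "in", "is"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize_cues(cues, k):
    """Return the (start, end) spans of the k highest-scoring cues, in time order.

    Assumption: a cue's score is the mean corpus frequency of its content
    words, a simple stand-in for a real text summarizer.
    """
    freq = Counter(w for cue in cues for w in tokenize(cue.text))

    def score(cue):
        words = tokenize(cue.text)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank cue indices by score (ties keep their original order), keep the top k.
    ranked = sorted(range(len(cues)), key=lambda i: score(cues[i]), reverse=True)[:k]
    # Restore chronological order so the video summary plays in sequence.
    return [(cues[i].start, cues[i].end) for i in sorted(ranked)]


demo_cues = [
    Cue(0.0, 4.0, "Volcanoes erupt when magma rises."),
    Cue(4.0, 7.0, "It was a sunny day."),
    Cue(7.0, 12.0, "Magma from volcanoes forms lava."),
    Cue(12.0, 15.0, "Thank you for watching."),
]
segments = summarize_cues(demo_cues, 2)
```

The returned spans are the parts of the video that would be concatenated to form the summary. Sorting the selected cues by position keeps the summary in the original narrative order.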
