Video Mining

Video Mining is an essential reference for practitioners and academics in the field of multimedia search engines. Half a terabyte, or 9,000 hours, of motion pictures is produced around the world every year. Furthermore, 3,000 television stations broadcasting twenty-four hours a day produce eight million hours per year, amounting to 24,000 terabytes of data. Although some of this data is labeled at the time of production, an enormous portion remains unindexed. Practical access to such huge amounts of data requires efficient tools for browsing and retrieving content of interest, so that producers and end users can quickly locate specific video sequences in this ocean of audio-visual data. Video Mining is important because it describes the main techniques being developed by the major players in industry and academic research to address this problem. For the first time, research from these leaders in the field, who are developing the next generation of multimedia search engines, is described in great detail and gathered into a single volume. Video Mining will give valuable insights to researchers and non-specialists alike who want to understand the principles applied by the multimedia search engines that are about to be deployed on the Internet, in studios' multimedia asset management systems, and in video-on-demand systems.
