Automatically extracting highlights for TV Baseball programs

The number of television channels is increasing rapidly, while the time viewers have to watch them remains the same or is decreasing. Users want to watch programs time-shifted (on demand) and/or to watch just the highlights to save time. In this paper we explore how to provide the latter capability: extracting highlights automatically so that viewing time can be reduced. We focus on baseball as our initial target because it is a very popular sport, a whole game is quite long, and the exciting portions are few. We detect highlights using audio-track features alone, without relying on expensive-to-compute video-track features. We use a combination of generic sports features and baseball-specific features to obtain our results, but believe that many other sports offer the same opportunity and that the techniques presented here will apply to them. We present details on the relative performance of various learning algorithms and a probabilistic framework for combining multiple sources of information. We compare the output of our algorithms against human-selected highlights for a diverse collection of baseball games, with very encouraging results.
