Automatic TV program genre classification based on audio patterns

We discuss the automatic classification of TV program genre based on audio patterns. An audio pattern is defined as a set of relative probabilities over a set of mid-level audio categories. First, we describe the extraction of these audio patterns. Second, we discuss how to use them for genre classification. Our genre classification differs from current methods for TV programs, such as those used in personal video recorders, in that it does not require an electronic program guide. Electronic program guides provide simple text-based genre information for whole programs. In contrast, we can determine genre information at the level of program segments. This can be important, for example, for TV program rating, where it makes it possible to treat program sections selectively. We demonstrate our method on a set of 7 different TV news and talk shows. The experimental results show that the audio patterns for news and talk shows are consistent with the general structure of these programs.
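
As an illustration of the definition above, the following minimal sketch computes an audio pattern for one program segment as the relative frequency of each mid-level category among its frame labels. The category names, the reference patterns, and the nearest-pattern rule with an L1 distance are all assumptions made for this example; the abstract does not specify the categories or the classifier actually used.

```python
from collections import Counter

# Hypothetical mid-level audio categories; the abstract does not list them.
CATEGORIES = ("speech", "music", "noise", "silence")


def audio_pattern(labels):
    """Return the relative frequency of each category among the frame labels."""
    counts = Counter(labels)
    total = sum(counts[c] for c in CATEGORIES)
    if total == 0:
        return {c: 0.0 for c in CATEGORIES}
    return {c: counts[c] / total for c in CATEGORIES}


def nearest_genre(pattern, references):
    """Pick the genre whose reference pattern is closest in L1 distance.

    This nearest-pattern rule is an illustrative stand-in, not the
    classifier described in the paper.
    """
    def l1_distance(ref):
        return sum(abs(pattern[c] - ref[c]) for c in CATEGORIES)

    return min(references, key=lambda genre: l1_distance(references[genre]))


# Made-up frame labels for one program segment.
segment = ["speech"] * 6 + ["music"] * 2 + ["noise", "silence"]

# Made-up reference patterns, one per genre.
references = {
    "news": {"speech": 0.8, "music": 0.1, "noise": 0.05, "silence": 0.05},
    "talk show": {"speech": 0.5, "music": 0.2, "noise": 0.25, "silence": 0.05},
}

pattern = audio_pattern(segment)
genre = nearest_genre(pattern, references)
print(pattern, genre)
```

Because the pattern is computed per segment rather than per program, the same function can be applied to each section of a broadcast independently, which is what segment-level genre information requires.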
