Rhythm detection for speech-music discrimination in MPEG compressed domain

A novel approach to speech-music discrimination based on rhythm (or beat) detection is introduced. Rhythmic pulses are detected by applying a long-term autocorrelation method to band-passed signals. This approach is combined with a second one whose features describe the energy peaks of the signal. The discriminator uses just three features, all computed from data taken directly from an MPEG-1 bitstream. It was tested on more than 3 hours of audio data, with an average recognition rate of 97.7%.
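The idea of detecting rhythmic pulses by long-term autocorrelation can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes a band-limited energy envelope is already available as a list of numbers (for example, one value per frame derived from a subband), and the function name `rhythm_strength`, the `frame_rate` parameter, and the 60 to 180 BPM search range are all hypothetical choices made for illustration.

```python
def rhythm_strength(envelope, frame_rate, min_bpm=60.0, max_bpm=180.0):
    """Return (strength, lag) of the strongest periodicity in the envelope.

    envelope   -- energy values of one band-passed signal, one per frame (assumed input)
    frame_rate -- envelope frames per second (assumed parameter)
    strength   -- normalized autocorrelation at the best lag, at most 1.0
    lag        -- beat period in frames, or 0 if no positive peak was found
    """
    n = len(envelope)
    if n < 2:
        return 0.0, 0

    # Remove the mean so that a constant envelope does not look periodic.
    mean = sum(envelope) / n
    x = [v - mean for v in envelope]
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return 0.0, 0

    # Only search lags that correspond to plausible beat periods.
    min_lag = max(1, round(frame_rate * 60.0 / max_bpm))
    max_lag = min(n - 1, round(frame_rate * 60.0 / min_bpm))

    best_val, best_lag = 0.0, 0
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation at this lag, normalized by the zero-lag energy.
        val = sum(a * b for a, b in zip(x, x[lag:])) / energy
        if val > best_val:
            best_val, best_lag = val, lag
    return best_val, best_lag


if __name__ == "__main__":
    # Synthetic envelope: one pulse every 50 frames at 100 frames per second.
    pulses = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]
    strength, lag = rhythm_strength(pulses, frame_rate=100.0)
    print("strength:", strength, "lag:", lag, "bpm:", 60.0 * 100.0 / lag)
```

A strongly periodic envelope gives a strength close to 1, while an envelope without a steady beat gives a value near 0, so a value of this kind could serve as one input feature to a speech-music classifier. The threshold and the way such a feature is combined with others are left open here.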
