Audio Segmentation via Tri-Model Bayesian Information Criterion

This paper addresses audio segmentation in practical media (e.g., TV series and movies), which usually consist of segments of various lengths, a considerable portion of them short. An unsupervised audio segmentation approach is presented, comprising a segmentation stage that detects potential acoustic changes and a refinement stage that refines these candidate changes using a tri-model Bayesian information criterion (BIC). Experiments show that the proposed approach detects short segments well and that the novel tri-model BIC effectively improves overall segmentation performance.
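For orientation, the sketch below shows the conventional two-model BIC test that BIC-based segmenters commonly use to accept or reject a candidate change point. It is not the tri-model criterion proposed here, whose formulation the abstract does not give. The function names `log_det_cov` and `delta_bic`, the full-covariance Gaussian model, and the `penalty_weight` parameter are assumptions made for illustration only.

```python
import numpy as np


def log_det_cov(frames):
    """Log-determinant of the maximum-likelihood covariance of the frames."""
    cov = np.atleast_2d(np.cov(frames, rowvar=False, bias=True))
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(frames, split, penalty_weight=1.0):
    """Conventional two-model delta-BIC for a candidate change at `split`.

    `frames` is an (n, d) array of feature vectors (e.g. cepstral features).
    One Gaussian fitted to the whole window is compared against two
    Gaussians fitted to frames[:split] and frames[split:].
    A positive value favours an acoustic change at `split`.
    """
    n, d = frames.shape
    left, right = frames[:split], frames[split:]

    # Log-likelihood gain obtained by modelling the two sides separately.
    gain = 0.5 * (
        n * log_det_cov(frames)
        - len(left) * log_det_cov(left)
        - len(right) * log_det_cov(right)
    )

    # Penalty for the extra mean vector and covariance matrix.
    n_extra_params = d + d * (d + 1) / 2
    penalty = 0.5 * penalty_weight * n_extra_params * np.log(n)

    return gain - penalty
```

In a two-stage scheme such as the one described above, a test of this kind would typically be evaluated at each candidate produced by the first stage, with candidates that yield a non-positive value discarded. Short segments leave few frames on one side of the split, which makes the covariance estimates unreliable; this is the usual weakness of the two-model test on media with many short segments.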
