Voice activity detection in transient noise environment using Laplacian pyramid algorithm

Voice activity detection (VAD) has attracted significant research effort over the last two decades. Despite much progress in the design of voice activity detectors, detection in the presence of transient noise and at low SNR remains a challenging problem. In this paper, we propose a new VAD algorithm based on supervised learning. Our method employs the Laplacian pyramid algorithm as a tool for function extension: we estimate the likelihood ratio function of unlabeled data by extending the likelihood ratios obtained from the labeled data. Simulation results demonstrate the advantages of the proposed method over conventional statistical methods in transient noise environments.
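The function-extension step can be illustrated with a minimal sketch of a Laplacian pyramid: a function known on labeled points is approximated by a sum of kernel-smoothed residuals at successively finer scales, and the same kernels are then evaluated at unlabeled points. This is not the implementation used in the paper. The Gaussian kernel form, the halving of the scale at each level, the parameter names (`sigma0`, `levels`), and the synthetic one-dimensional data standing in for feature vectors and likelihood ratios are all assumptions made for illustration.

```python
import numpy as np


def _normalized_kernel(a, b, sigma):
    """Row-normalized Gaussian kernel between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-sq_dist / sigma**2)
    # Guard against rows that underflow to zero (points far from all training data).
    return w / np.maximum(w.sum(axis=1, keepdims=True), np.finfo(float).tiny)


def laplacian_pyramid_extend(x_train, f_train, x_new, sigma0=1.0, levels=6):
    """Extend f, known at x_train, to x_new (hypothetical helper, assumed design).

    x_train: (n, d) labeled points; f_train: (n,) values; x_new: (m, d) points.
    """
    residual = np.asarray(f_train, dtype=float).copy()
    f_new = np.zeros(len(x_new))
    for level in range(levels):
        sigma = sigma0 / 2**level  # assumed schedule: halve the scale each level
        # Extend the current residual to the new points at this scale.
        f_new += _normalized_kernel(x_new, x_train, sigma) @ residual
        # Remove what this scale already explains on the labeled points.
        residual = residual - _normalized_kernel(x_train, x_train, sigma) @ residual
    return f_new


# Synthetic stand-in: 1-D "features" with a smooth function playing the role
# of the likelihood ratio on labeled data (illustrative values only).
x_train = np.linspace(0.0, 1.0, 50).reshape(-1, 1)
lr_train = np.sin(2 * np.pi * x_train[:, 0])
x_new = (x_train[:-1] + x_train[1:]) / 2  # unlabeled points between labeled ones
lr_new = laplacian_pyramid_extend(x_train, lr_train, x_new, sigma0=0.5, levels=6)
```

In a VAD setting, `x_train` would hold feature vectors of labeled frames, `lr_train` their likelihood ratios, and `lr_new` the extended values that could then be compared against a threshold. The choice of features and threshold is outside the scope of this sketch.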
