Energy-based voice activity detection algorithm using Gaussian and Cauchy kernels

In this work we present a simple and robust energy-based voice activity detection algorithm using kernels (KVAD). Taking advantage of kernel metrics, the algorithm uses Gaussian and Cauchy kernels to classify acoustic signatures as speech or non-speech. As evidence of the potential of the KVAD algorithm, comparisons with existing energy-based algorithms are presented, showing better performance in adverse environments such as low signal-to-noise ratio (SNR) and non-stationary noise.
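
To make the idea concrete, the following is a minimal sketch of how a kernel could be used on frame energies. It is not the KVAD algorithm itself, whose details are not given in this abstract. The decision rule (a frame counts as speech when its log energy has low kernel similarity to a noise-floor estimate), the function names, and the parameters `sigma`, `threshold`, and `noise_frames` are all assumptions made for illustration. Only the Gaussian and Cauchy kernel formulas are standard.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Standard Gaussian kernel: exp(-(x - y)^2 / (2 * sigma^2))."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))


def cauchy_kernel(x, y, sigma):
    """Standard Cauchy kernel: 1 / (1 + (x - y)^2 / sigma^2)."""
    return 1.0 / (1.0 + ((x - y) ** 2) / sigma ** 2)


def frame_log_energies(signal, frame_len):
    """Log energy in dB of consecutive non-overlapping frames."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        # Small offset avoids log10(0) on all-zero frames.
        energies.append(10.0 * math.log10(power + 1e-12))
    return energies


def kernel_vad(signal, frame_len, kernel, sigma=6.0, threshold=0.5,
               noise_frames=5):
    """Hypothetical decision rule, assumed for illustration only.

    The noise floor is estimated as the mean log energy of the first
    `noise_frames` frames (assumed to contain no speech). A frame is
    labelled speech when its energy is above that floor and its kernel
    similarity to the floor falls below `threshold`.
    """
    energies = frame_log_energies(signal, frame_len)
    reference = energies[:noise_frames]
    if not reference:
        return []
    noise_ref = sum(reference) / len(reference)
    return [e > noise_ref and kernel(e, noise_ref, sigma) < threshold
            for e in energies]
```

Passing `gaussian_kernel` or `cauchy_kernel` as the `kernel` argument switches between the two similarity measures. The Cauchy kernel decays more slowly with the energy difference, so under this assumed rule it would need a larger energy gap than the Gaussian kernel to cross the same threshold.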
