Efficient voice activity detection algorithm based on sub-band temporal envelope and sub-band long-term signal variability
Bin Liu | Ya Li | Jianhua Tao | Zhengqi Wen | Shanfeng Liu | Fuyuan Mo
[1] John S. Collura, et al. MELP: the new Federal Standard at 2400 bps, 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[2] I. Johansson, et al. The adaptive multi-rate speech coder, 1999, 1999 IEEE Workshop on Speech Coding Proceedings. Model, Coders, and Error Criteria (Cat. No.99EX351).
[3] S. Marple Jr., et al. Computing the discrete-time 'analytic' signal via FFT, 1999, Conference Record of the Thirty-First Asilomar Conference on Signals, Systems and Computers (Cat. No.97CB36136).
[4] Shrikanth S. Narayanan, et al. Robust Voice Activity Detection Using Long-Term Signal Variability, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[5] Carla Teixeira Lopes, et al. TIMIT Acoustic-Phonetic Continuous Speech Corpus, 2012.
[6] Andreas Stolcke, et al. Multispeaker speech activity detection for the ICSI meeting recorder, 2001, IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.
[7] M. Gabrea, et al. Correlation coefficient-based voice activity detector algorithm, 2004, Canadian Conference on Electrical and Computer Engineering 2004 (IEEE Cat. No.04CH37513).
[8] Spyridon Matsoukas, et al. Developing a Speech Activity Detection System for the DARPA RATS Program, 2012, INTERSPEECH.
[9] Bowon Lee. Minimum Mean-Squared Error A Posteriori Estimation of High Variance Vehicular Noise.
[10] Chungyong Lee, et al. Robust voice activity detection algorithm for estimating noise spectrum, 2000.
[11] Andrzej Drygajlo, et al. Entropy based voice activity detection in very noisy conditions, 2001, INTERSPEECH.
[12] Yoshihiko Nankaku, et al. Voice activity detection based on conditional random fields using multiple features, 2010, INTERSPEECH.
[13] G. Clark, et al. Reference, 2008.
[14] Ananya Misra, et al. Speech/Nonspeech Segmentation in Web Videos, 2012, INTERSPEECH.
[15] Wonyong Sung, et al. A statistical model-based voice activity detection, 1999, IEEE Signal Processing Letters.
[16] E. Shlomot, et al. ITU-T Recommendation G.729 Annex B: a silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications, 1997, IEEE Commun. Mag.
[17] Sang-Sik Ahn, et al. Statistical Model-Based VAD Algorithm with Wavelet Transform, 2006, IEICE Trans. Fundam. Electron. Commun. Comput. Sci.
[18] Björn W. Schuller, et al. Real-life voice activity detection with LSTM Recurrent Neural Networks and an application to Hollywood movies, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[19] K. Shikano, et al. Noise estimation using negentropy based voice-activity detector, 2004, The 2004 47th Midwest Symposium on Circuits and Systems, 2004. MWSCAS '04.
[20] S. Casale, et al. Performance evaluation and comparison of G.729/AMR/fuzzy voice activity detectors, 2002, IEEE Signal Processing Letters.
[21] Jonathan G. Fiscus, et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM (TIMIT), NIST, 1993.
[22] Douglas A. Reynolds, et al. An overview of automatic speaker recognition technology, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[23] Narimene Lezzoum, et al. A low-complexity voice activity detector for smart hearing protection of hyperacusic persons, 2013, INTERSPEECH.
[24] Nima Mesgarani, et al. Discrimination of speech from nonspeech based on multiscale spectro-temporal modulations, 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[25] Xianglong Liu, et al. An improved noise-robust voice activity detector based on hidden semi-Markov models, 2011, Pattern Recognit. Lett.