Boosting Contextual Information for Deep Neural Network Based Voice Activity Detection
[1] Nitish Srivastava,et al. Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.
[2] Yunde Jia,et al. Voice Activity Detection Via Noise Reducing Using Non-Negative Sparse Coding , 2013, IEEE Signal Processing Letters.
[3] Bayya Yegnanarayana,et al. Single Frequency Filtering Approach for Discriminating Speech and Nonspeech , 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[4] Hui Zhang,et al. Deep stacking networks with time series for speech separation , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[5] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[6] Leo Breiman,et al. Random Forests , 2001, Machine Learning.
[7] Sanjit K. Mitra,et al. Voice activity detection based on multiple statistical models , 2006, IEEE Transactions on Signal Processing.
[8] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[9] Dong Enqing,et al. Applying support vector machines to voice activity detection , 2002, 6th International Conference on Signal Processing, 2002..
[10] Juan Manuel Górriz,et al. Hard C-means clustering for voice activity detection , 2006, Speech Commun..
[11] Guy J. Brown,et al. Computational Auditory Scene Analysis: Principles, Algorithms, and Applications , 2006 .
[12] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.
[13] Javier Ramírez,et al. Statistical voice activity detection using a multiple observation likelihood ratio test , 2005, IEEE Signal Processing Letters.
[14] Brian Kingsbury,et al. Improvements to the IBM speech activity detection system for the DARPA RATS program , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Tara N. Sainath,et al. Improving deep neural networks for LVCSR using rectified linear units and dropout , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[16] DeLiang Wang,et al. Boosting contextual information for deep neural network based voice activity detection , 2016 .
[17] Joon-Hyuk Chang,et al. Voice activity detection based on statistical models and machine learning approaches , 2010, Comput. Speech Lang..
[18] Dong Yu,et al. The Deep Tensor Neural Network With Applications to Large Vocabulary Speech Recognition , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[19] Sridhar Krishna Nemala,et al. A Multistream Feature Framework Based on Bandpass Modulation Filtering for Robust Speech Recognition , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[20] DeLiang Wang,et al. A Feature Study for Classification-Based Speech Separation at Low Signal-to-Noise Ratios , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[21] Israel Cohen,et al. Voice Activity Detection in Presence of Transient Noise Using Spectral Clustering , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[22] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.
[23] Spyridon Matsoukas,et al. Developing a Speech Activity Detection System for the DARPA RATS Program , 2012, INTERSPEECH.
[25] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[26] Zheng-Hua Tan,et al. Low-Complexity Variable Frame Rate Analysis for Speech Recognition and Voice Activity Detection , 2010, IEEE Journal of Selected Topics in Signal Processing.
[27] DeLiang Wang,et al. Neural Network Based Pitch Tracking in Very Noisy Speech , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[28] Y. Ephraim,et al. Speech enhancement using a minimum mean square error short-time spectral amplitude estimator , 1984 .
[29] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[30] Thomas G. Dietterich. Multiple Classifier Systems , 2000, Lecture Notes in Computer Science.
[31] David Pearce,et al. The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions , 2000, INTERSPEECH.
[32] Jianwu Dang,et al. Voice Activity Detection Based on an Unsupervised Learning Framework , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[33] DeLiang Wang,et al. Towards Scaling Up Classification-Based Speech Separation , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[34] Thad Hughes,et al. Recurrent neural networks for voice activity detection , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[35] S. Boll,et al. Suppression of acoustic noise in speech using spectral subtraction , 1979 .
[36] Björn W. Schuller,et al. Real-life voice activity detection with LSTM Recurrent Neural Networks and an application to Hollywood movies , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[37] Joon-Hyuk Chang,et al. Dual-Microphone Voice Activity Detection Technique Based on Two-Step Power Level Difference Ratio , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[38] Xiao-Lei Zhang. Unsupervised domain adaptation for deep neural network based voice activity detection , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Wonyong Sung,et al. A statistical model-based voice activity detection , 1999, IEEE Signal Processing Letters.
[40] Rafik A. Goubran,et al. Robust voice activity detection using higher-order statistics in the LPC residual domain , 2001, IEEE Trans. Speech Audio Process..
[41] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1997, J. Comput. Syst. Sci..
[42] DeLiang Wang,et al. Auditory Segmentation Based on Onset and Offset Analysis , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[43] Kevin Walker,et al. The RATS radio traffic collection system , 2012, Odyssey.
[44] E. Shlomot,et al. ITU-T Recommendation G.729 Annex B: a silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications , 1997, IEEE Commun. Mag..
[45] DeLiang Wang,et al. Boosted deep neural networks and multi-resolution cochleagram features for voice activity detection , 2014, INTERSPEECH.
[46] R. Schapire. The Strength of Weak Learnability , 1990, Machine Learning.
[47] Wei Zhang,et al. A soft voice activity detector based on a Laplacian-Gaussian model , 2003, IEEE Trans. Speech Audio Process..
[48] Ian Vince McLoughlin. The use of low-frequency ultrasound for voice activity detection , 2014, INTERSPEECH.
[49] DeLiang Wang,et al. Deep Neural Network Based Supervised Speech Segregation Generalizes to Novel Noises through Large-scale Training , 2015 .
[50] Geoffrey E. Hinton,et al. On the importance of initialization and momentum in deep learning , 2013, ICML.
[51] Xiao-Lei Zhang,et al. Deep Belief Networks Based Voice Activity Detection , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[52] John H. L. Hansen,et al. Unsupervised Speech Activity Detection Using Voicing Measures and Perceptual Spectral Flux , 2013, IEEE Signal Processing Letters.