Boosted deep neural networks and multi-resolution cochleagram features for voice activity detection
暂无分享,去创建一个
[1] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[2] Mark Liberman,et al. Speech activity detection on youtube using deep neural networks , 2013, INTERSPEECH.
[3] Wonyong Sung,et al. A statistical model-based voice activity detection , 1999, IEEE Signal Processing Letters.
[4] Yunde Jia,et al. Voice Activity Detection Via Noise Reducing Using Non-Negative Sparse Coding , 2013, IEEE Signal Processing Letters.
[5] DeLiang Wang,et al. A feature study for classification-based speech separation at very low signal-to-noise ratio , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[6] Guy J. Brown,et al. Computational Auditory Scene Analysis: Principles, Algorithms, and Applications , 2006 .
[7] Geoffrey E. Hinton,et al. On the importance of initialization and momentum in deep learning , 2013, ICML.
[8] Xiao-Lei Zhang,et al. Deep Belief Networks Based Voice Activity Detection , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[9] Joon-Hyuk Chang,et al. Voice activity detection based on statistical models and machine learning approaches , 2010, Comput. Speech Lang..
[10] Sridhar Krishna Nemala,et al. A Multistream Feature Framework Based on Bandpass Modulation Filtering for Robust Speech Recognition , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[11] John H. L. Hansen,et al. Discriminative Training for Multiple Observation Likelihood Ratio Based Voice Activity Detection , 2010, IEEE Signal Processing Letters.
[12] Hoirin Kim,et al. Multiple Acoustic Model-Based Discriminative Likelihood Ratio Weighting for Voice Activity Detection , 2012, IEEE Signal Processing Letters.
[13] DeLiang Wang,et al. Auditory Segmentation Based on Onset and Offset Analysis , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[14] Jianwu Dang,et al. Voice Activity Detection Based on an Unsupervised Learning Framework , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[15] John H. L. Hansen,et al. Robust front-end processing for speaker identification over extremely degraded communication channels , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[16] Thad Hughes,et al. Recurrent neural networks for voice activity detection , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[17] Athanasios Katsamanis,et al. Multi-band long-term signal variability features for robust voice activity detection , 2013, INTERSPEECH.
[18] Israel Cohen,et al. Voice Activity Detection in Presence of Transient Noise Using Spectral Clustering , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[19] Javier Ramírez,et al. Statistical voice activity detection using a multiple observation likelihood ratio test , 2005, IEEE Signal Processing Letters.
[20] Ramjee Prasad,et al. Convex Combination of Multiple Statistical Models With Application to VAD , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[21] Yoshihiko Nankaku,et al. Voice activity detection based on conditional random fields using multiple features , 2010, INTERSPEECH.
[22] Dong Yu,et al. Deep-structured hidden conditional random fields for phonetic recognition , 2010, INTERSPEECH.
[23] Thomas G. Dietterich. Multiple Classifier Systems , 2000, Lecture Notes in Computer Science.
[24] Tara N. Sainath,et al. Improving deep neural networks for LVCSR using rectified linear units and dropout , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.