Bimodal Recurrent Neural Network for Audiovisual Voice Activity Detection
Carlos Busso | Fei Tao
[1] Petros Maragos, et al. Adaptive Multimodal Fusion by Uncertainty Compensation With Application to Audiovisual Speech Recognition, 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[2] Israel Cohen, et al. Audio-Visual Voice Activity Detection Using Diffusion Maps, 2015, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[3] DeLiang Wang, et al. Boosting Contextual Information for Deep Neural Network Based Voice Activity Detection, 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[4] DeLiang Wang, et al. Boosted deep neural networks and multi-resolution cochleagram features for voice activity detection, 2014, INTERSPEECH.
[5] Christian Jutten, et al. Two novel visual voice activity detectors based on appearance models and retinal filtering, 2007, 2007 15th European Signal Processing Conference.
[6] Björn W. Schuller, et al. Real-life voice activity detection with LSTM Recurrent Neural Networks and an application to Hollywood movies, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[7] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[8] John H. L. Hansen, et al. Unsupervised Speech Activity Detection Using Voicing Measures and Perceptual Spectral Flux, 2013, IEEE Signal Processing Letters.
[9] Aristodemos Pnevmatikakis, et al. Voice activity detection using audio-visual information, 2009, 2009 16th International Conference on Digital Signal Processing.
[10] John H. L. Hansen, et al. An unsupervised visual-only voice activity detection approach using temporal orofacial features, 2015, INTERSPEECH.
[11] Thad Hughes, et al. Recurrent neural networks for voice activity detection, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[12] Peng Liu, et al. Voice activity detection using visual information, 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[13] John H. L. Hansen, et al. Improving Boundary Estimation in Audiovisual Speech Activity Detection Using Bayesian Information Criterion, 2016, INTERSPEECH.
[14] Xiao-Lei Zhang, et al. Deep Belief Networks Based Voice Activity Detection, 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[15] Emiel Krahmer, et al. Visual voice activity detection at different speeds, 2013, AVSP.
[16] Carlos Busso, et al. Lipreading approach for isolated digits recognition under whisper and neutral speech, 2014, INTERSPEECH.
[17] Tara N. Sainath, et al. Feature Learning with Raw-Waveform CLDNNs for Voice Activity Detection, 2016, INTERSPEECH.
[18] Israel Cohen, et al. Adaptive weighting parameter in audio-visual voice activity detection, 2016, 2016 IEEE International Conference on the Science of Electrical Engineering (ICSEE).
[19] Yoni Bauduin, et al. Audio-Visual Speech Recognition, 2004.
[20] Yoshua Bengio, et al. Learning Deep Architectures for AI, 2007, Found. Trends Mach. Learn.
[21] Satoshi Tamura, et al. Voice activity detection based on fusion of audio and visual information, 2009, AVSP.
[22] Mark Liberman, et al. Speech activity detection on youtube using deep neural networks, 2013, INTERSPEECH.
[23] Juhan Nam, et al. Multimodal Deep Learning, 2011, ICML.
[24] Paul Over, et al. Creating HAVIC: Heterogeneous Audio Visual Internet Collection, 2012, LREC.
[25] Ben P. Milner, et al. Using audio-visual features for robust voice activity detection in clean and noisy speech, 2008, 2008 16th European Signal Processing Conference.
[26] Yoshua Bengio, et al. End-to-end attention-based large vocabulary speech recognition, 2015, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[27] John H. L. Hansen, et al. Audio-visual isolated digit recognition for whispered speech, 2011, 2011 19th European Signal Processing Conference.