Integration of deep bottleneck features for audio-visual speech recognition
Satoshi Tamura | Kazuya Takeda | Norihide Kitaoka | Yurie Iribe | Hiroshi Ninomiya