Audio-Visual Speech Recognition Using Bimodal-Trained Bottleneck Features for a Person with Severe Hearing Loss
Tetsuya Takiguchi | Yasuo Ariki | Kaoru Nakazono | Kiyohiro Omori | Ryo Aihara | Yuki Takashima | Nobuyuki Mitani
[1] Tetsuya Takiguchi, et al. Integration of Metamodel and Acoustic Model for Dysarthric Speech Recognition, 2009, J. Multim.
[2] Timothy F. Cootes, et al. Feature Detection and Tracking with Constrained Local Models, 2006, BMVC.
[3] Nobuo Ezaki, et al. Text detection from natural scene images: towards a system for visually impaired persons, 2004, Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004).
[4] G. Montavon. Deep learning for spoken language identification, 2009.
[5] Tetsuya Takiguchi, et al. Dysarthric speech recognition using a convolutive bottleneck network, 2014, 2014 12th International Conference on Signal Processing (ICSP).
[6] Juhan Nam, et al. Multimodal Deep Learning, 2011, ICML.
[7] Ashish Verma, et al. Late Integration in Audio-Visual Continuous Speech Recognition, 1999.
[8] Yann LeCun, et al. Learning long‐range vision for autonomous off‐road driving, 2009, J. Field Robotics.
[9] Vaibhava Goel, et al. Deep multimodal learning for Audio-Visual Speech Recognition, 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[11] Martin Karafiát, et al. Convolutive Bottleneck Network features for LVCSR, 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[12] Tetsuya Takiguchi, et al. Multimodal speech recognition of a person with articulation disorders using AAM and MAF, 2010, 2010 IEEE International Workshop on Multimedia Signal Processing.
[13] Tetsuya Takiguchi, et al. Audio-Visual Speech Recognition Using Convolutive Bottleneck Networks for a Person with Severe Hearing Loss, 2015, IPSJ Trans. Comput. Vis. Appl.
[14] Satoshi Tamura, et al. Integration of deep bottleneck features for audio-visual speech recognition, 2015, INTERSPEECH.
[15] Martin J. Russell, et al. Integrating audio and visual information to provide highly robust speech recognition, 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[16] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[17] Gerasimos Potamianos, et al. Discriminative training of HMM stream exponents for audio-visual speech recognition, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[18] Christophe Garcia, et al. Text Detection with Convolutional Neural Networks, 2008, VISAPP.
[19] Ying Wu, et al. Capturing human hand motion in image sequences, 2002, Proceedings of the Workshop on Motion and Video Computing.
[20] Christophe Garcia, et al. Convolutional face finder: a neural architecture for fast and robust face detection, 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[21] Tetsuya Takiguchi, et al. Local-feature-map Integration Using Convolutional Neural Networks for Music Genre Classification, 2012, INTERSPEECH.