Lipreading using convolutional neural network
Tetsuya Ogata | Hiroshi G. Okuno | Kazuhiro Nakadai | Yuki Yamaguchi | Kuniaki Noda
[1] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition, 2012.
[2] Barry-John Theobald, et al. Improving visual features for lip-reading, 2010, AVSP.
[3] Juergen Luettin, et al. Visual speech recognition using active shape models and hidden Markov models, 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[4] Shigeru Katagiri, et al. Construction of a large-scale Japanese speech database and its management system, 1989, International Conference on Acoustics, Speech, and Signal Processing.
[5] Aggelos K. Katsaggelos, et al. Comparison of low- and high-level visual features for audio-visual continuous automatic speech recognition, 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[6] Honglak Lee, et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations, 2009, ICML '09.
[7] Y. LeCun, et al. Learning methods for generic object recognition with invariance to pose and lighting, 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004.
[8] Hiroshi G. Okuno, et al. Automatic speech recognition improved by two-layered audio-visual integration for robot audition, 2009, 2009 9th IEEE-RAS International Conference on Humanoid Robots.
[9] Timothy F. Cootes, et al. Extraction of Visual Features for Lipreading, 2002, IEEE Trans. Pattern Anal. Mach. Intell.
[10] Juergen Luettin, et al. A comparison of model and transform-based visual features for audio-visual LVCSR, 2001, IEEE International Conference on Multimedia and Expo, 2001. ICME 2001.
[11] Richard B. Reilly, et al. Feature analysis for automatic speechreading, 2001, 2001 IEEE Fourth Workshop on Multimedia Signal Processing (Cat. No.01TH8564).
[12] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[13] Marc'Aurelio Ranzato, et al. Building high-level features using large scale unsupervised learning, 2011, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[14] Hui Jiang, et al. Rapid and effective speaker adaptation of convolutional neural network based models for speech recognition, 2013, INTERSPEECH.
[15] Dimitri Palaz, et al. Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks, 2013, INTERSPEECH.
[16] Timothy F. Cootes, et al. Active Appearance Models, 2001, IEEE Trans. Pattern Anal. Mach. Intell.
[17] Jon Barker, et al. Evidence of correlation between acoustic and visual features of speech, 1999.
[18] Hani Yehia, et al. Quantitative association of vocal-tract and facial behavior, 1998, Speech Commun.
[19] Yoshua Bengio, et al. Learning Deep Architectures for AI, 2007, Found. Trends Mach. Learn.