Audio-Visual Tibetan Speech Recognition Based on a Deep Dynamic Bayesian Network for Natural Human Robot Interaction
[1] Stuart J. Russell, et al. Dynamic Bayesian networks: representation, inference and learning, 2002.
[2] Qiang Ji, et al. Efficient Structure Learning of Bayesian Networks using Constraints, 2011, J. Mach. Learn. Res.
[3] Geoffrey E. Hinton, et al. Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine, 2010, NIPS.
[4] Honglak Lee, et al. Unsupervised feature learning for audio classification using convolutional deep belief networks, 2009, NIPS.
[5] Nir Friedman, et al. The Bayesian Structural EM Algorithm, 1998, UAI.
[6] Kate Saenko, et al. An Asynchronous DBN for Audio-Visual Speech Recognition, 2006, 2006 IEEE Spoken Language Technology Workshop.
[7] Alex Pentland, et al. Coupled hidden Markov models for complex action recognition, 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.
[8] Juhan Nam, et al. Multimodal Deep Learning, 2011, ICML.
[9] Kevin P. Murphy, et al. A coupled HMM for audio-visual speech recognition, 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[10] Jeff A. Bilmes, et al. DBN based multi-stream models for speech, 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03).
[11] P. Ladefoged, et al. Factor analysis of tongue shapes, 1971, Journal of the Acoustical Society of America.
[12] Jeff A. Bilmes, et al. DBN based multi-stream models for audio-visual speech recognition, 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[13] Qiang Ji, et al. Learning discriminant features for multi-view face and eye detection, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[14] Mark Hasegawa-Johnson, et al. Acoustic segmentation using switching state Kalman filter, 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03).
[15] Lv Guo-yun. Research on DBN-based continuous speech recognition and phoneme segment, 2007.
[16] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.
[17] Yang Wang, et al. Robust facial feature tracking under varying face pose and facial expression, 2007, Pattern Recognit.