Multi-pose lipreading and audio-visual speech recognition
[1] Richard Bellman,et al. Adaptive Control Processes: A Guided Tour , 1961, The Mathematical Gazette.
[2] S. Furui,et al. Cepstral analysis technique for automatic speaker verification , 1981 .
[3] P. Mermelstein,et al. Distance measures for speech recognition, psychological and instrumental , 1976 .
[4] Ieee Xplore,et al. IEEE Transactions on Pattern Analysis and Machine Intelligence Information for Authors , 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[5] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences , 1980 .
[6] L. Rabiner,et al. An introduction to hidden Markov models , 1986, IEEE ASSP Magazine.
[7] R. Campbell,et al. Hearing by eye : the psychology of lip-reading , 1988 .
[8] David Taylor. Hearing by Eye: The Psychology of Lip-Reading , 1988 .
[9] Lawrence R. Rabiner,et al. A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.
[10] H Hermansky,et al. Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.
[11] Milan Sonka,et al. Image Processing, Analysis and Machine Vision , 1993, Springer US.
[12] Biing-Hwang Juang,et al. Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.
[13] Roberto Battiti,et al. Using mutual information for selecting features in supervised neural net learning , 1994, IEEE Trans. Neural Networks.
[14] Yochai Konig,et al. "Eigenlips" for robust speech recognition , 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.
[15] David Beymer,et al. Face recognition under varying pose , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
[16] Hynek Hermansky,et al. RASTA processing of speech , 1994, IEEE Trans. Speech Audio Process.
[17] Steve Young,et al. The HTK book , 1995 .
[18] Christopher M. Bishop,et al. Neural networks for pattern recognition , 1995 .
[19] D. Stork,et al. Speechreading by Man and Machine: Models, Systems, and Applications , 1996 .
[20] A. Adjoudani,et al. On the Integration of Auditory and Visual Parameters in an HMM-based ASR , 1996 .
[21] Javier R. Movellan,et al. Channel Separability in the Audio-Visual Integration of Speech: A Bayesian Approach , 1996 .
[22] Martin J. Russell,et al. Integrating audio and visual information to provide highly robust speech recognition , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[23] David G. Stork,et al. Speechreading by Humans and Machines , 1996 .
[24] 平山亮. Conference report: Speechreading by Humans and Machines; Models Systems and Applications , 1997 .
[25] Jiri Matas,et al. On Combining Classifiers , 1998, IEEE Trans. Pattern Anal. Mach. Intell.
[26] Gerasimos Potamianos,et al. An image transform approach for HMM based automatic lipreading , 1998, Proceedings 1998 International Conference on Image Processing. ICIP98 (Cat. No.98CB36269).
[27] David G. Stork,et al. Speech recognition and sensory integration , 1998 .
[28] Jeff A. Bilmes,et al. Dynamic classifier combination in hybrid speech recognition systems using utterance-level confidence values , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[29] Juergen Luettin,et al. Audio-Visual Speech Modeling for Continuous Speech Recognition , 2000, IEEE Trans. Multim.
[30] Hervé Glotin,et al. Large-vocabulary audio-visual speech recognition: a summary of the Johns Hopkins Summer 2000 Workshop , 2001, 2001 IEEE Fourth Workshop on Multimedia Signal Processing (Cat. No.01TH8564).
[31] Sharon M. Thomas,et al. Effects of horizontal viewing angle on visual and audiovisual speech recognition. , 2001, Journal of experimental psychology. Human perception and performance.
[32] Juergen Luettin,et al. Hierarchical discriminant features for audio-visual LVCSR , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[33] Tsuhan Chen,et al. Audiovisual speech processing , 2001, IEEE Signal Process. Mag.
[34] Marcos Dipinto,et al. Discriminant analysis , 2020, Predictive Analytics.
[35] Mark A. Clements,et al. Automatic Speechreading with Applications to Human-Computer Interfaces , 2002, EURASIP J. Adv. Signal Process.
[36] Sabri Gurbuz,et al. Moving-Talker, Speaker-Independent Feature Study, and Baseline Results Using the CUAVE Multimodal Speech Corpus , 2002, EURASIP J. Adv. Signal Process.
[37] Chalapathy Neti,et al. Audio-visual speech recognition in challenging environments , 2003, INTERSPEECH.
[38] Chalapathy Neti,et al. Recent advances in the automatic recognition of audiovisual speech , 2003, Proc. IEEE.
[39] Daniel P. W. Ellis,et al. Using mutual information to design class-specific phone recognizers , 2003, INTERSPEECH.
[40] Thomas Vetter,et al. Face Recognition Based on Fitting a 3D Morphable Model , 2003, IEEE Trans. Pattern Anal. Mach. Intell.
[41] Surendra Ranganath,et al. Pose-invariant face recognition using a 3D deformable model , 2003, Pattern Recognit.
[42] Hermann Ney,et al. Bootstrap estimates for confidence intervals in ASR performance evaluation , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[43] Ralph Gross,et al. Appearance-based face recognition and light-fields , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[44] Juergen Luettin,et al. Audio-Visual Automatic Speech Recognition: An Overview , 2004 .
[45] Jian Zhang,et al. Analysis of lip geometric features for audio-visual speech recognition , 2004, IEEE Trans. Syst. Man Cybern. Part A.
[46] F. Fleuret. Fast Binary Feature Selection with Conditional Mutual Information , 2004, J. Mach. Learn. Res.
[47] Thomas Vetter,et al. Synthesis of Novel Views from a Single Face Image , 1998, International Journal of Computer Vision.
[48] Fuhui Long,et al. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy , 2003, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[49] P. Jonathon Phillips,et al. Face recognition based on frontal views generated from non-frontal images , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).
[50] A. Murat Tekalp,et al. Discriminative Analysis of Lip Motion Features for Speaker Identification and Speech-Reading , 2006, IEEE Transactions on Image Processing.
[51] Sridha Sridharan,et al. A unified approach to multi-pose audio-visual ASR , 2007, INTERSPEECH.
[52] Wen Gao,et al. Locally Linear Regression for Pose-Invariant Face Recognition , 2007, IEEE Transactions on Image Processing.
[53] Nasser M. Nasrabadi,et al. Pattern Recognition and Machine Learning , 2006, Technometrics.
[54] Sridha Sridharan,et al. An extended pose-invariant lipreading system , 2007, AVSP.
[55] Simon King,et al. Articulatory Feature-Based Methods for Acoustic and Audio-Visual Speech Recognition: Summary from the 2006 JHU Summer workshop , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[56] Sridha Sridharan,et al. Continuous pose-invariant lipreading , 2008, INTERSPEECH.
[57] Jean-Philippe Thiran,et al. Information Theoretic Feature Extraction for Audio-Visual Speech Recognition , 2009, IEEE Transactions on Signal Processing.
[58] Jean-Philippe Thiran,et al. Multipose audio-visual speech recognition , 2011, 2011 19th European Signal Processing Conference.