Reconstructing intelligible audio speech from visual speech features
[1] Juergen Luettin, et al. Audio-Visual Speech Modeling for Continuous Speech Recognition, 2000, IEEE Trans. Multim.
[2] Ben P. Milner, et al. Analysis of correlation between audio and visual speech features for clean audio feature prediction in noise, 2006, INTERSPEECH.
[3] Hervé Glotin, et al. Large-vocabulary audio-visual speech recognition: a summary of the Johns Hopkins Summer 2000 Workshop, 2001, 2001 IEEE Fourth Workshop on Multimedia Signal Processing (Cat. No.01TH8564).
[4] Sophie M. Wuerger, et al. Continuous audio-visual digit recognition using N-best decision fusion, 2004, Inf. Fusion.
[5] Jon Barker, et al. An audio-visual corpus for speech perception and automatic speech recognition, 2006, The Journal of the Acoustical Society of America.
[6] Juergen Luettin, et al. Audio-Visual Automatic Speech Recognition: An Overview, 2004.
[7] Yannis Stylianou, et al. Applying the harmonic plus noise model in concatenative speech synthesis, 2001, IEEE Trans. Speech Audio Process.
[8] Timothy F. Cootes, et al. Active Appearance Models, 2001, IEEE Trans. Pattern Anal. Mach. Intell.
[9] Tara N. Sainath, et al. Improving deep neural networks for LVCSR using rectified linear units and dropout, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[10] Jean-Philippe Thiran, et al. On Dynamic Stream Weighting for Audio-Visual Speech Recognition, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[11] Heiga Zen, et al. Speaker-Independent HMM-based Speech Synthesis System, 2007.
[12] Hideki Kawahara, et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds, 1999, Speech Commun.
[13] Thomas F. Quatieri, et al. Speech analysis/Synthesis based on a sinusoidal representation, 1986, IEEE Trans. Acoust. Speech Signal Process.
[14] Ben P. Milner, et al. Using audio-visual features for robust voice activity detection in clean and noisy speech, 2008, 2008 16th European Signal Processing Conference.
[15] Ben P. Milner, et al. Visually Derived Wiener Filters for Speech Enhancement, 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Paul Boersma, et al. Praat, a system for doing phonetics by computer, 2002.
[17] Jonas Beskow, et al. Animated Lombard speech: Motion capture, facial animation and visual intelligibility of speech produced in adverse conditions, 2014, Comput. Speech Lang.
[18] J L Schwartz, et al. Audio-visual enhancement of speech in noise, 2001, The Journal of the Acoustical Society of America.
[19] Martin A. Riedmiller, et al. A direct adaptive method for faster backpropagation learning: the RPROP algorithm, 1993, IEEE International Conference on Neural Networks.
[20] Faheem Khan, et al. Speaker separation using visual speech features and single-channel audio, 2013, INTERSPEECH.
[21] Khalid Sayood, et al. Introduction to Data Compression, 1996.
[22] Jon Barker, et al. Evidence of correlation between acoustic and visual features of speech, 1999.
[23] Hani Yehia, et al. Quantitative association of vocal-tract and facial behavior, 1998, Speech Commun.
[24] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[25] P. Mermelstein. Articulatory model for the study of speech production, 1973, The Journal of the Acoustical Society of America.
[26] Q. Summerfield, et al. Lipreading and audio-visual speech perception, 1992, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.
[27] Philip J. B. Jackson, et al. Audio-visual Convolutive Blind Source Separation, 2010.