Shin'ichi Satoh | Roger Zimmermann | Rajiv Ratn Shah | Yaman Kumar | Mayank Aggarwal | Pratham Nawal
[1] Liangliang Cao,et al. Lip2Audspec: Speech Reconstruction from Silent Lip Movements Video , 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Ali M. Reza,et al. Realization of the Contrast Limited Adaptive Histogram Equalization (CLAHE) for Real-Time Image Enhancement , 2004 .
[3] Samuel Pachoud,et al. Macro-cuboïd based probabilistic matching for lip-reading digits , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.
[4] Juergen Luettin,et al. Audio-Visual Automatic Speech Recognition: An Overview , 2004 .
[5] Yannis Stylianou,et al. Applying the harmonic plus noise model in concatenative speech synthesis , 2001, IEEE Trans. Speech Audio Process..
[6] John G. Beerends,et al. A Perceptual Audio Quality Measure Based on a Psychoacoustic Sound Representation , 1992 .
[7] Mohammed Bennamoun,et al. Listening with Your Eyes: Towards a Practical Visual Speech Recognition System Using Deep Boltzmann Machines , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[8] F. Itakura. Line spectrum representation of linear predictor coefficients of speech signals , 1975 .
[9] Jon Barker,et al. An audio-visual corpus for speech perception and automatic speech recognition. , 2006, The Journal of the Acoustical Society of America.
[10] Gerasimos Potamianos,et al. Lipreading Using Profile Versus Frontal Views , 2006, 2006 IEEE Workshop on Multimedia Signal Processing.
[11] Sridha Sridharan,et al. Continuous pose-invariant lipreading , 2008, INTERSPEECH.
[12] Ben P. Milner,et al. Reconstructing intelligible audio speech from visual speech features , 2015, INTERSPEECH.
[13] C. Benoît,et al. A set of French visemes for visual speech synthesis , 1994 .
[14] Matti Pietikäinen,et al. Lipreading with Local Spatiotemporal Descriptors , 2009, IEEE Transactions on Multimedia.
[15] Mostafa Mehdipour-Ghazi,et al. Visual Speech Recognition Using PCA Networks and LSTMs in a Tandem GMM-HMM System , 2016, ACCV Workshops.
[16] Tsuhan Chen,et al. Audio-visual integration in multimodal communication , 1998, Proc. IEEE.
[17] Kee-Eung Kim,et al. Multi-view Automatic Lip-Reading Using Neural Network , 2016, ACCV Workshops.
[18] Andries P. Hekstra,et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[19] Barry-John Theobald,et al. View Independent Computer Lip-Reading , 2012, 2012 IEEE International Conference on Multimedia and Expo.
[20] Tsuhan Chen,et al. Profile View Lip Reading , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[21] Shimon Whiteson,et al. LipNet: End-to-End Sentence-level Lipreading , 2016, 1611.01599.
[22] Juhan Nam,et al. Multimodal Deep Learning , 2011, ICML.
[23] Sadaoki Furui,et al. Audio-Visual Speech Recognition Using Lip Information Extracted from Side-Face Images , 2007, EURASIP J. Audio Speech Music. Process..
[24] Matti Pietikäinen,et al. A review of recent advances in visual speech decoding , 2014, Image Vis. Comput..
[25] Thomas F. Quatieri,et al. Speech analysis/Synthesis based on a sinusoidal representation , 1986, IEEE Trans. Acoust. Speech Signal Process..
[26] Maja Pantic,et al. End-to-End Multi-View Lipreading , 2017, BMVC.
[27] Barry-John Theobald,et al. Improving visual features for lip-reading , 2010, AVSP.
[28] Birger Kollmeier,et al. PEMO-Q—A New Method for Objective Audio Quality Assessment Using a Model of Auditory Perception , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[29] Shmuel Peleg,et al. Improved Speech Reconstruction from Silent Video , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).
[30] G. Fant. Acoustic theory of speech production : with calculations based on X-ray studies of Russian articulations , 1961 .
[31] David Taylor. Hearing by Eye: The Psychology of Lip-Reading , 1988 .
[32] Matti Pietikäinen,et al. OuluVS2: A multi-view audiovisual database for non-rigid mouth motion analysis , 2015, 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).
[33] Tetsuya Ogata,et al. Audio-visual speech recognition using deep learning , 2014, Applied Intelligence.
[34] Eric David Petajan,et al. Automatic Lipreading to Enhance Speech Recognition (Speech Reading) , 1984 .
[35] Hongbin Zha,et al. Unsupervised Random Forest Manifold Alignment for Lipreading , 2013, 2013 IEEE International Conference on Computer Vision.
[36] Maja Pantic,et al. Deep complementary bottleneck features for visual speech recognition , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[37] Walid Mahdi,et al. A New Visual Speech Recognition Approach for RGB-D Cameras , 2014, ICIAR.
[38] Richard M. Stern,et al. Signal and Feature Compensation Methods for Robust Speech Recognition , 2002 .
[40] K. Krigger. Cerebral palsy: an overview. , 2006, American family physician.
[41] Jürgen Schmidhuber,et al. Lipreading with long short-term memory , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[42] Thomas Sporer,et al. PEAQ - The ITU Standard for Objective Measurement of Perceived Audio Quality , 2000 .
[43] Ali M. Reza,et al. Realization of the Contrast Limited Adaptive Histogram Equalization (CLAHE) for Real-Time Image Enhancement , 2004, J. VLSI Signal Process..
[44] Matti Pietikäinen,et al. Towards a practical lipreading system , 2011, CVPR 2011.
[45] Victor Zue,et al. Speech database development at MIT: Timit and beyond , 1990, Speech Commun..
[46] Q. Summerfield,et al. Lipreading and audio-visual speech perception. , 1992, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.
[47] Joon Son Chung,et al. Lip Reading in the Wild , 2016, ACCV.
[48] Richard Bowden,et al. Learning Sequential Patterns for Lipreading , 2011, BMVC.
[49] Walid Mahdi,et al. Unified System for Visual Speech Recognition and Speaker Identification , 2015, ACIVS.