Scope for Deep Learning: A Study in Audio-Visual Speech Recognition
[1] Jianwu Dang,et al. Audio-visual speech recognition integrating 3D lip information obtained from the Kinect , 2016, Multimedia Systems.
[2] Petros Maragos,et al. Adaptive Multimodal Fusion by Uncertainty Compensation With Application to Audiovisual Speech Recognition , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[3] Alice Caplier,et al. Lip contour segmentation and tracking compliant with lip-reading application constraints , 2012, Machine Vision and Applications.
[4] Dinesh Kant Kumar,et al. Automatic visual speech segmentation and recognition using directional motion history images and Zernike moments , 2013, The Visual Computer.
[5] Tetsuya Ogata,et al. Audio-visual speech recognition using deep learning , 2014, Applied Intelligence.
[6] Farshad Almasganj,et al. Audio-visual feature fusion via deep neural networks for automatic speech recognition , 2018, Digit. Signal Process..
[7] Sushila Maheshkar,et al. Visual Speech Recognition Using Optical Flow and Hidden Markov Model , 2019, Wirel. Pers. Commun..
[8] Tolga Çiloglu,et al. Bimodal automatic speech segmentation based on audio and visual information fusion , 2011, Speech Commun..
[9] Timothy J. Hazen. Visual model structures and synchrony constraints for audio-visual speech recognition , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[10] S. Palanivel,et al. Lip reading of hearing impaired persons using HMM , 2011, Expert Syst. Appl..
[11] Mahesh Chandra,et al. Multiple camera in car audio-visual speech recognition using phonetic and visemic information , 2015, Comput. Electr. Eng..
[12] Ahmed Hussen Abdelaziz. Comparing Fusion Models for DNN-Based Audiovisual Continuous Speech Recognition , 2018, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[13] Hazem M. Abbas,et al. Improved features and dynamic stream weight adaption for robust Audio-Visual Speech Recognition framework , 2019, Digit. Signal Process..
[14] Timothy F. Cootes,et al. Extraction of Visual Features for Lipreading , 2002, IEEE Trans. Pattern Anal. Mach. Intell..
[15] Andrzej Czyzewski,et al. An audio-visual corpus for multimodal automatic speech recognition , 2017, Journal of Intelligent Information Systems.
[16] Jiri Matas,et al. XM2VTSDB: The Extended M2VTS Database , 1999 .
[17] Lucas D. Terissi,et al. Robust front-end for audio, visual and audio–visual speech classification , 2018, Int. J. Speech Technol..
[18] Jean-Philippe Thiran,et al. On Dynamic Stream Weighting for Audio-Visual Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[19] Matti Pietikäinen,et al. Lipreading with Local Spatiotemporal Descriptors , 2009, IEEE Transactions on Multimedia.
[20] Jean-Philippe Thiran,et al. Multi-pose lipreading and audio-visual speech recognition , 2012, EURASIP J. Adv. Signal Process..
[21] Joon Son Chung,et al. Lip Reading in the Wild , 2016, ACCV.
[22] Juergen Luettin,et al. Audio-Visual Speech Modelling for Continuous Speech Recognition , 2000 .
[23] Paul Mineiro,et al. Robust Sensor Fusion: Analysis and Application to Audio Visual Speech Recognition , 1998, Machine Learning.
[24] J.N. Gowdy,et al. CUAVE: A new audio-visual database for multimodal human-computer interface research , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[25] Suprava Patnaik,et al. A novel lip reading algorithm by using localized ACM and HMM: Tested for digit recognition , 2014 .
[26] Joon Son Chung,et al. Lip Reading in Profile , 2017, BMVC.
[27] Maja Pantic,et al. Deep complementary bottleneck features for visual speech recognition , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] Andrzej Czyzewski,et al. A comparative study of English viseme recognition methods and algorithms , 2017, Multimedia Tools and Applications.
[29] Darryl Stewart,et al. Robust Audio-Visual Speech Recognition Under Noisy Audio-Video Conditions , 2014, IEEE Transactions on Cybernetics.
[30] Stephen J. Cox,et al. The challenge of multispeaker lip-reading , 2008, AVSP.
[31] Sadaoki Furui,et al. Multi-Modal Speech Recognition Using Optical-Flow Analysis for Lip Images , 2004, J. VLSI Signal Process..
[32] Naomi Harte,et al. TCD-TIMIT: An Audio-Visual Corpus of Continuous Speech , 2015, IEEE Transactions on Multimedia.
[33] M. Z. Ibrahim,et al. Geometrical-based lip-reading using template probabilistic multi-dimension dynamic time warping , 2015, J. Vis. Commun. Image Represent..
[34] Joon Son Chung,et al. Lip Reading Sentences in the Wild , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).