Hush-Hush Speak: Speech Reconstruction Using Silent Videos
Rajiv Ratn Shah | Debanjan Mahata | Amanda Stent | Yaman Kumar | Dhruva Sahrawat | Shashwat Uttam | Mansi Aggarwal
[1] Chalapathy Neti, et al. Translingual visual speech synthesis, 2000, 2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat. No.00TH8532).
[2] Liangliang Cao, et al. Lip2Audspec: Speech Reconstruction from Silent Lip Movements Video, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[3] Shmuel Peleg, et al. Vid2speech: Speech reconstruction from silent video, 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Shmuel Peleg, et al. Improved Speech Reconstruction from Silent Video, 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).
[5] Swati Aggarwal, et al. Lipper: Speaker Independent Speech Synthesis Using Multi-View Lipreading, 2019, AAAI.
[6] Fei-Fei Li, et al. ImageNet: A large-scale hierarchical image database, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[7] Andries P. Hekstra, et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs, 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[8] Matti Pietikäinen, et al. OuluVS2: A multi-view audiovisual database for non-rigid mouth motion analysis, 2015, 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG).
[9] Rohit Jain, et al. Lipper: Synthesizing Thy Speech using Multi-View Lipreading, 2019, AAAI.
[10] Andrew Owens, et al. Visually Indicated Sounds, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[11] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[12] Jonathan G. Fiscus, et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM (TIMIT), NIST, 1993.
[13] Rohit Jain, et al. MyLipper: A Personalized System for Speech Reconstruction using Multi-view Visual Feeds, 2018, 2018 IEEE International Symposium on Multimedia (ISM).
[14] Shin'ichi Satoh, et al. Harnessing AI for Speech Reconstruction using Multi-view Silent Video Feed, 2018, ACM Multimedia.
[15] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.