Exemplar-based Lip-to-Speech Synthesis Using Convolutional Neural Networks
[1] H. McGurk, et al. Hearing lips and seeing voices, 1976, Nature.
[2] Ashish Verma, et al. Late Integration in Audio-Visual Continuous Speech Recognition, 1999.
[3] H. Sebastian Seung, et al. Algorithms for Non-negative Matrix Factorization, 2000, NIPS.
[4] Jon Barker, et al. An audio-visual corpus for speech perception and automatic speech recognition, 2006, The Journal of the Acoustical Society of America.
[5] Simon King, et al. Using HMM-based Speech Synthesis to Reconstruct the Voice of Individuals with Degenerative Speech Disorders, 2012, INTERSPEECH.
[6] Tetsuya Takiguchi, et al. Exemplar-based voice conversion in noisy environment, 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[7] Tomoki Toda, et al. Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech, 2012, Speech Commun.
[8] Josef Chaloupka, et al. Audio-visual speech recognition in noisy audio environments, 2013, 2013 36th International Conference on Telecommunications and Signal Processing (TSP).
[9] Frédo Durand, et al. The visual microphone, 2014, ACM Trans. Graph.
[10] T. Takiguchi, et al. Lip-to-Speech Synthesis Using Locality-Constraint Non-negative Matrix Factorization, 2015.
[11] Joon Son Chung, et al. Lip Reading in the Wild, 2016, ACCV.
[12] Masanori Morise, et al. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications, 2016, IEICE Trans. Inf. Syst.
[13] Jürgen Schmidhuber, et al. Improving Speaker-Independent Lipreading with Domain-Adversarial Training, 2017, INTERSPEECH.
[14] Liangliang Cao, et al. Lip2Audspec: Speech Reconstruction from Silent Lip Movements Video, 2017, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).