Visual-to-speech conversion based on maximum likelihood estimation
Tetsuya Takiguchi | Yasuo Ariki | Rina Ra | Ryo Aihara
[1] Keiichi Tokuda,et al. HMM-based text-to-audio-visual speech synthesis , 2000, INTERSPEECH.
[2] Satoshi Nakamura,et al. Lip movement synthesis from speech based on hidden Markov models , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.
[3] Li-Rong Dai,et al. Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion , 2013, INTERSPEECH.
[4] F. Lavagetto,et al. Converting speech into lip movements: a multimedia telephone for hard of hearing people , 1995 .
[5] Frank K. Soong,et al. A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversion , 2010, INTERSPEECH.
[6] Tetsuya Takiguchi,et al. Multiple Non-Negative Matrix Factorization for Many-to-Many Voice Conversion , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[7] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[8] Tomoki Toda,et al. Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech , 2012, Speech Commun..
[9] Tomoki Toda,et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[10] H. McGurk,et al. Hearing lips and seeing voices , 1976, Nature.
[11] Tetsuya Takiguchi,et al. Lip-to-speech synthesis using locality-constraint non-negative matrix factorization , 2015 .
[12] Eric Moulines,et al. Continuous probabilistic transform for voice conversion , 1998, IEEE Trans. Speech Audio Process..
[13] Shimon Whiteson,et al. LipNet: Sentence-level Lipreading , 2016, ArXiv.
[14] Hideki Kawahara,et al. STRAIGHT, exploitation of the other aspect of VOCODER: Perceptually isomorphic decomposition of speech sounds , 2006 .