Voice Conversion Using RNN Pre-Trained by Recurrent Temporal Restricted Boltzmann Machines
[1] B. Schölkopf,et al. Modeling Human Motion Using Binary Latent Variables, 2007.
[2] Hermann Ney,et al. A Deep Learning Approach to Machine Transliteration , 2009, WMT@EACL.
[3] Tetsuya Takiguchi,et al. Voice conversion in high-order eigen space using deep belief nets , 2013, INTERSPEECH.
[4] Li-Rong Dai,et al. Minimum Kullback–Leibler Divergence Parameter Generation for HMM-Based Speech Synthesis , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[5] Li-Rong Dai,et al. Joint spectral distribution modeling using restricted boltzmann machines for voice conversion , 2013, INTERSPEECH.
[6] Tomoki Toda,et al. Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech , 2012, Speech Commun..
[7] Ren-Hua Wang,et al. USTC System for Blizzard Challenge 2006 an Improved HMM-based Speech Synthesis Method , 2006, Blizzard Challenge.
[8] Eric Moulines,et al. Voice transformation using PSOLA technique , 1991, Speech Commun..
[9] Hideki Kawahara,et al. Tandem-STRAIGHT: A temporally stable power spectral representation for periodic signals and applications to interference-free spectrum, F0, and aperiodicity estimation , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[10] Geoffrey E. Hinton,et al. Acoustic Modeling Using Deep Belief Networks , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[11] Jonathan Le Roux,et al. Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[12] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[13] Geoffrey E. Hinton,et al. The Recurrent Temporal Restricted Boltzmann Machine , 2008, NIPS.
[14] Keikichi Hirose,et al. One-to-Many Voice Conversion Based on Tensor Representation of Speaker Space , 2011, INTERSPEECH.
[15] Jiao Licheng,et al. An Algorithm for SAR Image Embedded Compression Based on Wavelet Transform , 2007, Eighth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD 2007).
[16] Chung-Hsien Wu,et al. Map-based adaptation for speech conversion using adaptation data selection and non-parallel training , 2006, INTERSPEECH.
[17] Eric Moulines,et al. Continuous probabilistic transform for voice conversion , 1998, IEEE Trans. Speech Audio Process..
[18] Razvan Pascanu,et al. On the difficulty of training recurrent neural networks , 2012, ICML.
[19] Tetsuya Takiguchi,et al. Exemplar-based voice conversion in noisy environment , 2012, 2012 IEEE Spoken Language Technology Workshop (SLT).
[20] Xu Shao,et al. Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model , 2002, INTERSPEECH.
[21] Zhen Yang,et al. Voice Conversion Using Canonical Correlation Analysis Based on Gaussian Mixture Model, 2007.
[22] Tomoki Toda,et al. Eigenvoice conversion based on Gaussian mixture model , 2006, INTERSPEECH.
[23] Haizhou Li,et al. Exemplar-based voice conversion using non-negative spectrogram deconvolution , 2013, SSW.
[24] Kishore Prahallad,et al. Voice conversion using Artificial Neural Networks , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.
[25] Yoshua Bengio,et al. Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription , 2012, ICML.
[26] Satoshi Nakamura,et al. Voice conversion through vector quantization , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.
[27] Tapani Raiko,et al. Improved Learning of Gaussian-Bernoulli Restricted Boltzmann Machines , 2011, ICANN.
[28] Ren-Hua Wang,et al. Minimum segmentation error based discriminative training for speech synthesis application , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[29] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[30] Nobuaki Minematsu,et al. Probabilistic integration of joint density model and speaker model for voice conversion , 2010, INTERSPEECH.
[31] David Haussler,et al. Unsupervised learning of distributions on binary vectors using two layer networks , 1991, NIPS 1991.
[32] Li Deng,et al. High-performance robust speech recognition using stereo training data , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[33] Xavier Rodet,et al. Intonation Conversion from Neutral to Expressive Speech , 2011, INTERSPEECH.
[34] Alexander Kain,et al. Spectral voice conversion for text-to-speech synthesis , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[35] Geoffrey E. Hinton,et al. 3D Object Recognition with Deep Belief Nets , 2009, NIPS.
[36] Shigeru Katagiri,et al. ATR Japanese speech database as a tool of speech recognition and synthesis , 1990, Speech Commun..
[37] Keiichi Tokuda,et al. A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis , 2007, IEICE Trans. Inf. Syst..
[38] Tetsuya Takiguchi,et al. Voice conversion in time-invariant speaker-independent space , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Tomoki Toda,et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[40] Dong Yu,et al. Modeling Spectral Envelopes Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[41] Haizhou Li,et al. Conditional restricted Boltzmann machine for voice conversion , 2013, 2013 IEEE China Summit and International Conference on Signal and Information Processing.
[42] Paul Smolensky,et al. Information processing in dynamical systems: foundations of harmony theory, 1986.
[43] Keikichi Hirose,et al. Speech generation from hand gestures based on space mapping , 2009, INTERSPEECH.