Voice conversion with SI-DNN and KL divergence based mapping without parallel training data
[1] Athanasios Mouchtaris,et al. Nonparallel training for voice conversion based on a parameter adaptation approach , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[2] Xia Wang,et al. Supervisory Data Alignment for Text-Independent Voice Conversion , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[3] Janet M. Baker,et al. The Design for the Wall Street Journal-based CSR Corpus , 1992, HLT.
[4] Satoshi Nakamura,et al. Voice conversion through vector quantization , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.
[5] R. A. Leibler,et al. On Information and Sufficiency , 1951 .
[6] Ahmad Akbari,et al. A new wavelet thresholding method for speech enhancement based on symmetric Kullback-Leibler divergence , 2009, 2009 14th International CSI Computer Conference.
[7] Keiichi Tokuda,et al. Speech parameter generation algorithms for HMM-based speech synthesis , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[8] Tara N. Sainath,et al. Fundamental Technologies in Modern Speech Recognition , 2012 . DOI: 10.1109/MSP.2012.2205597.
[9] Haifeng Li,et al. Sequence error (SE) minimization training of neural network for voice conversion , 2014, INTERSPEECH.
[10] Haizhou Li,et al. Exemplar-Based Sparse Representation With Residual Compensation for Voice Conversion , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[11] Frank K. Soong,et al. Measuring attribute dissimilarity with HMM KL-divergence for speech synthesis , 2007, SSW.
[12] Tomoki Toda,et al. The NU-NAIST Voice Conversion System for the Voice Conversion Challenge 2016 , 2016, INTERSPEECH.
[13] Tomoki Toda,et al. The Voice Conversion Challenge 2016 , 2016, INTERSPEECH.
[14] Frank K. Soong,et al. A Cross-Language State Sharing and Mapping Approach to Bilingual (Mandarin–English) TTS , 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[15] Li-Rong Dai,et al. Minimum Kullback–Leibler Divergence Parameter Generation for HMM-Based Speech Synthesis , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Jun Du,et al. A New Minimum Divergence Approach to Discriminative Training , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[17] Alexander Kain,et al. Spectral voice conversion for text-to-speech synthesis , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[18] Hao Wang,et al. Phonetic posteriorgrams for many-to-one voice conversion without parallel data training , 2016, 2016 IEEE International Conference on Multimedia and Expo (ICME).
[19] Chung-Hsien Wu,et al. Map-based adaptation for speech conversion using adaptation data selection and non-parallel training , 2006, INTERSPEECH.
[20] Athanasios Mouchtaris,et al. A Spectral Conversion Approach to Single-Channel Speech Enhancement , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[21] Li-Rong Dai,et al. Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion , 2013, INTERSPEECH.
[22] Kishore Prahallad,et al. Spectral Mapping Using Artificial Neural Networks for Voice Conversion , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[23] Frank K. Soong,et al. Optimal clustering of multivariate normal distributions using divergence and its application to HMM adaptation , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..
[24] Hermann Ney,et al. Text-Independent Voice Conversion Based on Unit Selection , 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.
[25] Li-Rong Dai,et al. Voice Conversion Using Deep Neural Networks With Layer-Wise Generative Training , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[26] Tomoki Toda,et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[27] Seyed Hamidreza Mohammadi,et al. An overview of voice conversion systems , 2017, Speech Commun..
[28] Yiming Wang,et al. Low Latency Acoustic Modeling Using Temporal Convolution and LSTMs , 2018, IEEE Signal Processing Letters.
[29] Kun Li,et al. Voice conversion using deep Bidirectional Long Short-Term Memory based Recurrent Neural Networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[30] Biing-Hwang Juang,et al. Line spectrum pair (LSP) and speech data compression , 1984, ICASSP.
[31] Bayya Yegnanarayana,et al. Transformation of formants for voice conversion using artificial neural networks , 1995, Speech Commun..
[32] Hermann Ney,et al. A first step towards text-independent voice conversion , 2004, INTERSPEECH.
[33] Haizhou Li,et al. Conditional restricted Boltzmann machine for voice conversion , 2013, 2013 IEEE China Summit and International Conference on Signal and Information Processing.
[34] Simon King,et al. Modelling the uncertainty in recovering articulation from acoustics , 2003, Comput. Speech Lang..
[35] Haizhou Li,et al. Exemplar-based sparse representation of timbre and prosody for voice conversion , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[36] Eric Moulines,et al. Voice transformation using PSOLA technique , 1991, Speech Commun..
[37] Alex Acero,et al. Robust bandwidth extension of noise-corrupted narrowband speech , 2005, INTERSPEECH.
[38] Daniel Erro,et al. INCA Algorithm for Training Voice Conversion Systems From Nonparallel Corpora , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[39] Haifeng Li,et al. A KL divergence and DNN approach to cross-lingual TTS , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[40] Frank K. Soong,et al. A new DNN-based high quality pronunciation evaluation for computer-aided language learning (CALL) , 2013, INTERSPEECH.
[41] Kaisheng Yao,et al. KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[42] Eric Moulines,et al. Continuous probabilistic transform for voice conversion , 1998, IEEE Trans. Speech Audio Process..
[43] Hyung Soon Kim,et al. Narrowband to wideband conversion of speech using GMM based transformation , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[44] Koby Crammer,et al. Non-parallel voice conversion using joint optimization of alignment by temporal context and spectral distortion , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[45] Keiichi Tokuda,et al. Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model , 2008, Speech Commun..
[46] Tomoki Toda,et al. Voice conversion for various types of body transmitted speech , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.