Applying Spectral Normalisation and Efficient Envelope Estimation and Statistical Transformation for the Voice Conversion Challenge 2016

Paper presented at Interspeech 2016, held September 8 to 12, 2016, in San Francisco, California.

Junichi Yamagishi | Jordi Bonada | Fernando Villavicencio | Felipe Espic

[1] Haizhou Li, et al. Exemplar-Based Sparse Representation With Residual Compensation for Voice Conversion, 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[2] Olivier Rosec, et al. Voice Conversion Using Dynamic Frequency Warping With Amplitude Scaling, for Parallel or Nonparallel Corpora, 2012, IEEE Transactions on Audio, Speech, and Language Processing.

[3] Levent M. Arslan, et al. Speaker transformation using sentence HMM based alignments and detailed prosody modification, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[4] Satoshi Imai, et al. Cepstral synthesis of Japanese from CV syllable parameters, 1980, ICASSP.

[5] Tomoki Toda, et al. Eigenvoice conversion based on Gaussian mixture model, 2006, INTERSPEECH.

[6] Tomoki Toda, et al. Statistical singing voice conversion with direct waveform modification based on the spectrum differential, 2014, INTERSPEECH.

[7] Zhizheng Wu, et al. Analysis of the Voice Conversion Challenge 2016 Evaluation Results, 2016, INTERSPEECH.

[8] Axel Röbel, et al. Improving LPC Spectral Envelope Extraction Of Voiced Speech By True-Envelope Estimation, 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[9] Yoshinori Sagisaka, et al. Acoustic characteristics of speaker individuality: Control and conversion, 1995, Speech Commun.

[10] Axel Röbel, et al. On cepstral and all-pole based spectral envelope modeling with unknown model order, 2007, Pattern Recognit. Lett.

[11] Bayya Yegnanarayana, et al. Transformation of formants for voice conversion using artificial neural networks, 1995, Speech Commun.

[12] Alexander Kain, et al. Spectral voice conversion for text-to-speech synthesis, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[13] X. Rodet. Efficient Spectral Envelope Estimation and Its Application to Pitch Shifting and Envelope Preservation, 2005.

[14] Li-Rong Dai, et al. Voice Conversion Using Deep Neural Networks With Layer-Wise Generative Training, 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[15] Athanasios Mouchtaris, et al. Nonparallel training for voice conversion based on a parameter adaptation approach, 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[16] L.C. Schwardt, et al. Voice conversion based on static speaker characteristics, 1998, Proceedings of the 1998 South African Symposium on Communications and Signal Processing-COMSIG '98 (Cat. No. 98EX214).

[17] B. Yegnanarayana, et al. Voice conversion: Factors responsible for quality, 1985, ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[18] Amro El-Jaroudi, et al. Discrete all-pole modeling, 1991, IEEE Trans. Signal Process.

[19] Daniel Erro, et al. Voice Conversion Based on Weighted Frequency Warping, 2010, IEEE Transactions on Audio, Speech, and Language Processing.

[20] Fernando Villavicencio, et al. GMM-PCA based speaker-timbre conversion on full-quality speech, 2010, SSW.

[21] Axel Röbel, et al. Applying improved spectral modeling for High Quality voice conversion, 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[22] Eric Moulines, et al. Voice transformation using PSOLA technique, 1991, Speech Commun.

[23] Yannis Agiomyrgiannakis, et al. Voice Morphing that improves TTS quality using an optimal dynamic frequency warping-and-weighting transform, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[24] Dae Hee Youn, et al. A new voice transformation method based on both linear and nonlinear prediction analysis, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[25] Jordi Bonada. Wide-Band Harmonic Sinusoidal Modeling, 2008.

[26] Yung-Hwan Oh, et al. Hidden Markov model based voice conversion using dynamic characteristics of speaker, 1997, EUROSPEECH.

[27] Satoshi Nakamura, et al. Voice conversion through vector quantization, 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[28] Werner Verhelst, et al. Voice conversion using partitions of spectral feature space, 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.

[29] Tomoki Toda, et al. The Voice Conversion Challenge 2016, 2016, INTERSPEECH.

[30] Norio Higuchi, et al. Training data selection for voice conversion using speaker selection and vector field smoothing, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[31] Tomoki Toda, et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory, 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[32] Eric Moulines, et al. Continuous probabilistic transform for voice conversion, 1998, IEEE Trans. Speech Audio Process.

[33] Jordi Bonada, et al. Applying voice conversion to concatenative singing-voice synthesis, 2010, INTERSPEECH.

[34] Jordi Bonada, et al. Observation-model error compensation for enhanced spectral envelope transformation in voice conversion, 2015, 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP).