Applying Spectral Normalisation and Efficient Envelope Estimation and Statistical Transformation for the Voice Conversion Challenge 2016

Paper presented at Interspeech 2016, held from 8 to 12 September 2016 in San Francisco, California.
