F0 transformation techniques for statistical voice conversion with direct waveform modification with spectral differential
暂无分享,去创建一个
Tomoki Toda | Satoshi Nakamura | Kazuhiro Kobayashi | T. Toda | Satoshi Nakamura | Kazuhiro Kobayashi
[1] Tomoki Toda,et al. Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation , 2006, INTERSPEECH.
[2] Hideki Kawahara,et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds , 1999, Speech Commun..
[3] Satoshi Nakamura,et al. Voice conversion through vector quantization , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.
[4] Yannis Stylianou,et al. Applying the harmonic plus noise model in concatenative speech synthesis , 2001, IEEE Trans. Speech Audio Process..
[5] Tetsuya Takiguchi,et al. Exemplar-Based Voice Conversion Using Sparse Representation in Noisy Environments , 2013, IEICE Trans. Fundam. Electron. Commun. Comput. Sci..
[6] Haizhou Li,et al. Exemplar-Based Sparse Representation With Residual Compensation for Voice Conversion , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[7] Banno Hideki,et al. GMM voice conversion of singing voice using vocal tract area function , 2010 .
[8] Keiichi Tokuda,et al. Mel-generalized cepstral analysis - a unified approach to speech spectral estimation , 1994, ICSLP.
[9] John H. L. Hansen,et al. Speaker-specific pitch contour modeling and modification , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[10] Tomoki Toda,et al. The Voice Conversion Challenge 2016 , 2016, INTERSPEECH.
[11] Keiichi Tokuda,et al. Speech parameter generation algorithms for HMM-based speech synthesis , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[12] Tomoki Toda,et al. Implementation of Computationally Efficient Real-Time Voice Conversion , 2012, INTERSPEECH.
[13] Heiga Zen,et al. Gaussian Process Experts for Voice Conversion , 2011, INTERSPEECH.
[14] Tomoki Toda,et al. Statistical singing voice conversion with direct waveform modification based on the spectrum differential , 2014, INTERSPEECH.
[15] Inma Hernáez,et al. Harmonics Plus Noise Model Based Vocoder for Statistical Parametric Speech Synthesis , 2014, IEEE Journal of Selected Topics in Signal Processing.
[16] Kun Li,et al. Voice conversion using deep Bidirectional Long Short-Term Memory based Recurrent Neural Networks , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[17] Tomoki Toda,et al. Augmented speech production based on real-time statistical voice conversion , 2014, 2014 IEEE Global Conference on Signal and Information Processing (GlobalSIP).
[18] Tomoki Toda,et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[19] Tomoki Toda,et al. Statistical singing voice conversion based on direct waveform modification with global variance , 2015, INTERSPEECH.
[20] Ning Xu,et al. Voice conversion based on Gaussian processes by coherent and asymmetric training with limited training data , 2014, Speech Commun..
[21] Zhizheng Wu,et al. Analysis of the Voice Conversion Challenge 2016 Evaluation Results , 2016, INTERSPEECH.
[22] Werner Verhelst,et al. An overlap-add technique based on waveform similarity (WSOLA) for high quality time-scale modification of speech , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[23] Li-Rong Dai,et al. Voice Conversion Using Deep Neural Networks With Layer-Wise Generative Training , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[24] Tomoki Toda,et al. The NU-NAIST Voice Conversion System for the Voice Conversion Challenge 2016 , 2016, INTERSPEECH.
[25] Tomoki Toda,et al. Implementation of F0 transformation for statistical singing voice conversion based on direct waveform modification , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[26] Hideki Kawahara,et al. Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT , 2001, MAVEBA.
[27] Tomoki Toda,et al. Postfilters to Modify the Modulation Spectrum for Statistical Parametric Speech Synthesis , 2016, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[28] Eric Moulines,et al. Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones , 1989, Speech Commun..
[29] Eric Moulines,et al. Continuous probabilistic transform for voice conversion , 1998, IEEE Trans. Speech Audio Process..