Transformation of prosody in voice conversion
[1] Haizhou Li,et al. Exemplar-based sparse representation of timbre and prosody for voice conversion , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] R. Srikanth,et al. Duration modelling in voice conversion using artificial neural networks , 2012, 2012 19th International Conference on Systems, Signals and Image Processing (IWSSIP).
[3] Yu Tsao,et al. Locally Linear Embedding for Exemplar-Based Spectral Conversion , 2016, INTERSPEECH.
[4] Moncef Gabbouj,et al. Hierarchical modeling of F0 contours for voice conversion , 2014, INTERSPEECH.
[5] Chng Eng Siong,et al. Correlation-based frequency warping for voice conversion , 2014, The 9th International Symposium on Chinese Spoken Language Processing.
[6] Haizhou Li,et al. Exemplar-based voice conversion using non-negative spectrogram deconvolution , 2013, SSW.
[7] Yi Xu. Speech Prosody: A Methodological Review , 2011 .
[8] Satoshi Nakamura,et al. Voice conversion through vector quantization , 1988, ICASSP-88, International Conference on Acoustics, Speech, and Signal Processing.
[9] Bin Ma,et al. Spoken Language Recognition: From Fundamentals to Practice , 2013, Proceedings of the IEEE.
[10] Tetsuya Takiguchi,et al. Parallel Dictionary Learning for Multimodal Voice Conversion Using Matrix Factorization , 2016 .
[11] Stephen DiVerdi,et al. Cute: A concatenative method for voice conversion using exemplar-based unit selection , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[12] Haizhou Li,et al. An overview of text-independent speaker recognition: From features to supervectors , 2010, Speech Commun.
[13] Zhizheng Wu,et al. Multidimensional scaling of systems in the Voice Conversion Challenge 2016 , 2016, SSW.
[14] Inma Hernáez,et al. Parametric Voice Conversion Based on Bilinear Frequency Warping Plus Amplitude Scaling , 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[15] Tuomas Virtanen,et al. Exemplar-Based Sparse Representations for Noise Robust Automatic Speech Recognition , 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Masami Akamine,et al. Multilevel parametric-base F0 model for speech synthesis , 2008, INTERSPEECH.
[17] D. Crystal. Systems of prosodic and paralinguistic features in English / by David Crystal and Randolph Quirk , 1964 .
[18] Satoshi Nakamura,et al. Speaker adaptation and voice conversion by codebook mapping , 1991, 1991 IEEE International Symposium on Circuits and Systems.
[19] R. Patel,et al. Acoustic characteristics of the question-statement contrast in severe dysarthria due to cerebral palsy , 2003, Journal of speech, language, and hearing research : JSLHR.
[20] Paul Taylor,et al. Text-to-Speech Synthesis , 2009 .
[21] Chung-Hsien Wu,et al. Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[22] Paavo Alku,et al. Wavelets for intonation modeling in HMM speech synthesis , 2013, SSW.
[23] Simon King,et al. Transforming F0 contours , 2003, INTERSPEECH.
[24] Tomoki Toda,et al. The Voice Conversion Challenge 2016 , 2016, INTERSPEECH.
[25] Moncef Gabbouj,et al. Voice Conversion Using Partial Least Squares Regression , 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[26] Haizhou Li,et al. Fundamental frequency modeling using wavelets for emotional voice conversion , 2015, 2015 International Conference on Affective Computing and Intelligent Interaction (ACII).
[27] H. Sebastian Seung,et al. Algorithms for Non-negative Matrix Factorization , 2000, NIPS.
[28] Heiga Zen,et al. Probabilistic feature mapping based on trajectory HMMs , 2008, INTERSPEECH.
[29] Tomoki Toda,et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory , 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[30] Martti Vainio,et al. Continuous wavelet transform for analysis of speech prosody , 2013 .
[31] Haizhou Li,et al. Deep Bidirectional LSTM Modeling of Timbre and Prosody for Emotional Voice Conversion , 2016, INTERSPEECH.
[32] Tetsuya Takiguchi,et al. Voice conversion based on Non-negative matrix factorization using phoneme-categorized dictionary , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[33] Haizhou Li,et al. Text-independent F0 transformation with non-parallel data for voice conversion , 2010, INTERSPEECH.
[34] Arthur R. Toth,et al. Incorporating durational modification in voice transformation , 2008, INTERSPEECH.
[35] Haizhou Li,et al. Exemplar-Based Sparse Representation With Residual Compensation for Voice Conversion , 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[36] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[37] Moncef Gabbouj,et al. Voice Conversion Using Dynamic Kernel Partial Least Squares Regression , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[38] Yoshihiko Nankaku,et al. Simultaneous conversion of duration and spectrum based on statistical models including time-sequence matching , 2008, INTERSPEECH.
[39] Chng Eng Siong,et al. High quality voice conversion using prosodic and high-resolution spectral features , 2015, Multimedia Tools and Applications.