On the analysis and evaluation of prosody conversion techniques
Haizhou Li | Kay Chen Tan | Grandee Lee | Berrak Sisman
[1] Moncef Gabbouj, et al. Voice Conversion Using Partial Least Squares Regression, 2010, IEEE Transactions on Audio, Speech, and Language Processing.
[2] Jacob Benesty, et al. On the Importance of the Pearson Correlation Coefficient in Noise Reduction, 2008, IEEE Transactions on Audio, Speech, and Language Processing.
[3] Tetsuya Takiguchi, et al. Voice conversion based on Non-negative matrix factorization using phoneme-categorized dictionary, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[4] Haizhou Li, et al. Fundamental frequency modeling using wavelets for emotional voice conversion, 2015, 2015 International Conference on Affective Computing and Intelligent Interaction (ACII).
[5] Tomoki Toda, et al. The Voice Conversion Challenge 2016, 2016, INTERSPEECH.
[6] Haizhou Li, et al. Text-independent F0 transformation with non-parallel data for voice conversion, 2010, INTERSPEECH.
[7] S. Furui, et al. Cepstral analysis technique for automatic speaker verification, 1981.
[8] Donald J. Berndt, et al. Using Dynamic Time Warping to Find Patterns in Time Series, 1994, KDD Workshop.
[9] Haizhou Li, et al. Exemplar-based voice conversion using non-negative spectrogram deconvolution, 2013, SSW.
[10] Haizhou Li, et al. Exemplar-Based Sparse Representation With Residual Compensation for Voice Conversion, 2014, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[11] Stan Salvador, et al. FastDTW: Toward Accurate Dynamic Time Warping in Linear Time and Space, 2004.
[12] Haizhou Li, et al. Transformation of prosody in voice conversion, 2017, 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC).
[13] Haizhou Li, et al. Sparse representation of phonetic features for voice conversion with and without parallel data, 2017, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).
[14] Eamonn J. Keogh, et al. Exact indexing of dynamic time warping, 2002, Knowledge and Information Systems.
[15] Yi Xu. Speech Prosody: A Methodological Review, 2011.
[16] Haizhou Li, et al. An overview of text-independent speaker recognition: From features to supervectors, 2010, Speech Commun.
[17] Masami Akamine, et al. Multilevel parametric-base F0 model for speech synthesis, 2008, INTERSPEECH.
[18] G. Huttar. Relations between prosodic variables and emotions in normal American English utterances, 1968, Journal of speech and hearing research.
[19] Paavo Alku, et al. Wavelets for intonation modeling in HMM speech synthesis, 2013, SSW.
[20] Simon King, et al. Transforming F0 contours, 2003, INTERSPEECH.
[21] Tomoki Toda, et al. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory, 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[22] Martti Vainio, et al. Continuous wavelet transform for analysis of speech prosody, 2013.
[23] B.-H. Juang, et al. On the hidden Markov model and dynamic time warping for speech recognition — A unified view, 1984, AT&T Bell Laboratories Technical Journal.
[24] Haizhou Li, et al. Exemplar-based sparse representation of timbre and prosody for voice conversion, 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[25] Eamonn J. Keogh, et al. Scaling up dynamic time warping for datamining applications, 2000, KDD '00.
[26] Jacob Benesty, et al. Pearson Correlation Coefficient, 2009.
[27] Bin Ma, et al. Spoken Language Recognition: From Fundamentals to Practice, 2013, Proceedings of the IEEE.
[28] Moncef Gabbouj, et al. Hierarchical modeling of F0 contours for voice conversion, 2014, INTERSPEECH.