A uniform phase representation for the harmonic model in speech synthesis applications
[1] Axel Röbel, et al. Phase Minimization for Glottal Model Estimation, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[2] Yannis Stylianou, et al. Time-scale modifications based on a full-band adaptive harmonic model, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[3] Alan W. Black, et al. The CMU Arctic speech databases, 2004, SSW.
[4] Nathalie Henrich Bernardoni, et al. The spectrum of glottal flow models, 2006.
[5] Christophe d'Alessandro, et al. Analysis/synthesis and modification of the speech aperiodic component, 1996, Speech Commun.
[6] P. Sprent, et al. Statistical Analysis of Circular Data, 1994.
[7] F. Itakura, et al. The effect of group delay spectrum on timbre, 2002.
[8] Jordi Bonada. High Quality Voice Transformations Based on Modeling Radiated Voice Pulses in Frequency Domain, 2004.
[9] Keiichi Tokuda, et al. Multi-Space Probability Distribution HMM, 2002.
[10] Yamato Ohtani, et al. HMM-based speech synthesis using sub-band basis spectrum model, 2012, INTERSPEECH.
[11] Gilles Degottex, et al. Usual voice quality features and glottal features for emotional valence detection, 2012.
[12] Rainer Martin, et al. Phase estimation for signal reconstruction in single-channel source separation, 2012, INTERSPEECH.
[13] Bayya Yegnanarayana, et al. Determination of instants of significant excitation in speech using group delay function, 1995, IEEE Trans. Speech Audio Process.
[14] Amro El-Jaroudi, et al. Discrete all-pole modeling, 1991, IEEE Trans. Signal Process.
[15] Kuldip K. Paliwal, et al. Product of power spectrum and group delay function for speech recognition, 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[16] Simon King, et al. Estimation of voice source and vocal tract characteristics based on multi-frame analysis, 2003, INTERSPEECH.
[17] Villy Hansen, et al. On Aural Phase Detection: Part 1, 1974.
[18] Mark J. F. Gales, et al. The Application of Hidden Markov Models in Speech Recognition, 2007, Found. Trends Signal Process.
[19] Thomas F. Quatieri, et al. Shape invariant time-scale and pitch modification of speech, 1992, IEEE Trans. Signal Process.
[20] Mike E. Davies, et al. IEEE International Conference on Acoustics Speech and Signal Processing, 2008.
[21] Hideki Kawahara, et al. Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds, 1999, Speech Commun.
[22] Thierry Dutoit, et al. Phase-based information for voice pathology detection, 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[23] Yannis Stylianou, et al. Analysis and Synthesis of Speech Using an Adaptive Full-Band Harmonic Model, 2013, IEEE Transactions on Audio, Speech, and Language Processing.
[24] B. Yegnanarayana, et al. Epoch extraction from linear prediction residual for identification of closed glottis interval, 1979.
[25] Nicholas I. Fisher, et al. Statistical Analysis of Circular Data, 1993.
[26] R. Miller. Nature of the Vocal Cord Wave, 1956.
[27] Inma Hernáez, et al. HNM-based MFCC+F0 extractor applied to statistical speech synthesis, 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[28] R. J. McAulay, et al. Speech transformations based on a sinusoidal representation, 1985, ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[29] A. Oppenheim, et al. Nonlinear filtering of multiplied and convolved signals, 1968.
[30] Pejman Mowlaee, et al. Iterative Closed-Loop Phase-Aware Single-Channel Speech Enhancement, 2013, IEEE Signal Processing Letters.
[31] Inma Hernáez, et al. Harmonics Plus Noise Model Based Vocoder for Statistical Parametric Speech Synthesis, 2014, IEEE Journal of Selected Topics in Signal Processing.
[32] Yannis Stylianou, et al. Evaluating the intelligibility benefit of speech modifications in known noise conditions, 2013, Speech Commun.
[33] Hideki Kawahara, et al. Auditory Adaptation in Voice Perception, 2008, Current Biology.
[34] Yamato Ohtani, et al. Continuous F0 in the source-excitation generation for HMM-based TTS: Do we need voiced/unvoiced classification?, 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] A. Oppenheim, et al. Nonlinear filtering of multiplied and convolved signals, 1968.
[36] Andreas Spanias, et al. Speech coding: a tutorial review, 1994, Proc. IEEE.
[37] Kai Yu, et al. Continuous F0 Modeling for HMM Based Statistical Parametric Speech Synthesis, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[38] Christophe d'Alessandro, et al. Zeros of z-transform (ZZT) decomposition of speech for source-tract separation, 2004, INTERSPEECH.
[39] Haizhou Li, et al. An overview of text-independent speaker recognition: From features to supervectors, 2010, Speech Commun.
[40] B. Yegnanarayana, et al. Significance of group delay functions in signal reconstruction from spectral magnitude or phase, 1984.
[41] Jean Laroche, et al. Improved phase vocoder time-scale modification of audio, 1999, IEEE Trans. Speech Audio Process.
[42] Christophe d'Alessandro, et al. The voice source as a causal/anticausal linear filter, 2003.
[43] Thomas F. Quatieri, et al. Sine-wave phase coding at low data rates, 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.
[44] Heiga Zen, et al. Details of the Nitech HMM-Based Speech Synthesis System for the Blizzard Challenge 2005, 2007, IEICE Trans. Inf. Syst.
[45] Heiga Zen, et al. The HMM-based speech synthesis system (HTS) version 2.0, 2007, SSW.
[46] Bayya Yegnanarayana, et al. Speech processing using group delay functions, 1991, Signal Process.
[47] Axel Röbel, et al. Mixed source model and its adapted vocal tract filter estimate for voice transformation and synthesis, 2013, Speech Commun.
[48] Axel Röbel, et al. Function of Phase-Distortion for glottal model estimation, 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[49] Eric Moulines, et al. Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones, 1989, Speech Commun.
[50] Yannis Stylianou, et al. Adaptive AM–FM Signal Decomposition With Application to Speech Analysis, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[51] Eric Moulines, et al. Estimation of the spectral envelope of voiced sounds using a penalized likelihood approach, 2001, IEEE Trans. Speech Audio Process.
[52] Thomas F. Quatieri, et al. Phase coherence in speech reconstruction for enhancement and coding applications, 1989, International Conference on Acoustics, Speech, and Signal Processing.
[53] Ibon Saratxaga, et al. Perceptual Importance of the Phase Related Information in Speech, 2012, INTERSPEECH.
[54] Thierry Dutoit, et al. Complex cepstrum-based decomposition of speech for glottal source estimation, 2009, INTERSPEECH.
[55] Yannis Stylianou. Removing linear phase mismatches in concatenative speech synthesis, 2001, IEEE Trans. Speech Audio Process.
[56] Hideki Kawahara, et al. Aperiodicity extraction and control using mixed mode excitation and group delay manipulation for a high quality speech analysis, modification and synthesis system STRAIGHT, 2001, MAVEBA.
[57] Thomas F. Quatieri, et al. Speech analysis/Synthesis based on a sinusoidal representation, 1986, IEEE Trans. Acoust. Speech Signal Process.
[58] Mike Brookes, et al. Estimation of Glottal Closure Instants in Voiced Speech Using the DYPSA Algorithm, 2007, IEEE Transactions on Audio, Speech, and Language Processing.
[59] H. Saunders, et al. Digital Signal Processing (2nd Edition), 1988.
[60] Stan Davis, et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences, 1980.
[61] Alan V. Oppenheim, et al. Digital Signal Processing, 1978, IEEE Transactions on Systems, Man, and Cybernetics.
[62] Eric Moulines, et al. A diphone synthesis system based on time-domain prosodic modifications of speech, 1989, International Conference on Acoustics, Speech, and Signal Processing.
[63] R. McAulay, et al. Multirate sinusoidal transform coding at rates from 2.4 kbps to 8 kbps, 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[64] Ibon Saratxaga, et al. Detection of synthetic speech for the problem of imposture, 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[65] Yannis Stylianou, et al. Wrapped Gaussian Mixture Models for Modeling and High-Rate Quantization of Phase Data of Speech, 2009, IEEE Transactions on Audio, Speech, and Language Processing.
[66] John Vanderkooy, et al. On the Audibility of Midrange Phase Distortion in Audio Systems, 1980.
[67] Mark J. F. Gales, et al. Complex cepstrum for statistical parametric speech synthesis, 2013, Speech Commun.
[68] Akihiko Sugiyama, et al. Phase randomization - A new paradigm for single-channel signal enhancement, 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[69] Eric Moulines, et al. HNS: Speech modification based on a harmonic+noise model, 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[70] D. Paul. The spectral envelope estimation vocoder, 1981.
[71] I. Saratxaga, et al. Simple representation of signal phase for harmonic speech models, 2009.
[72] Yannis Stylianou, et al. Harmonic plus noise models for speech, combined with statistical methods, for speech and speaker modification, 1996.
[73] Yannis Stylianou, et al. Pitch modifications of speech based on an adaptive Harmonic Model, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[74] Heiga Zen, et al. Speech Synthesis Based on Hidden Markov Models, 2013, Proceedings of the IEEE.
[75] Jon Sánchez, et al. Versatile Speech Databases for High Quality Synthesis for Basque, 2012, LREC.
[76] Xavier Rodet, et al. A HMM-based speech synthesis system using a new glottal source and vocal-tract separation method, 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[77] Nicholas W. D. Evans, et al. Speaker Diarization: A Review of Recent Research, 2010, IEEE Transactions on Audio, Speech, and Language Processing.