Listeners' weighting of acoustic cues to synthetic speech naturalness: A multidimensional scaling analysis