The impact of speech recognition on speech synthesis
[1] Vassilios Diakoloukas,et al. Maximum-likelihood stochastic-transformation adaptation of hidden Markov models , 1999, IEEE Trans. Speech Audio Process..
[2] Jeff A. Bilmes,et al. Buried Markov models for speech recognition , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[3] Ellen Eide. Automatic modeling of pronunciation variations , 1999, EUROSPEECH.
[4] Jan P. H. van Santen,et al. Assignment of segmental duration in text-to-speech synthesis , 1994, Comput. Speech Lang..
[5] Mari Ostendorf,et al. HMM topology design using maximum likelihood successive state splitting , 1997, Comput. Speech Lang..
[6] Yannis Stylianou,et al. Perceptual and objective detection of discontinuities in concatenative speech synthesis , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[7] Alex Acero,et al. Formant analysis and synthesis using hidden Markov models , 1999, EUROSPEECH.
[8] Levent M. Arslan,et al. Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum , 1997, EUROSPEECH.
[9] Mari Ostendorf,et al. Use of higher level linguistic structure in acoustic modeling for speech recognition , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[10] Keiichi Tokuda,et al. An algorithm for speech parameter generation from continuous mixture HMMs with dynamic features , 1995, EUROSPEECH.
[11] Frederick Jelinek,et al. Statistical methods for speech recognition , 1997 .
[12] Philip C. Woodland,et al. A hidden Markov-model-based trainable speech synthesizer , 1999, Comput. Speech Lang..
[13] Richard Wright,et al. Prosody and phonetic variability: Lessons learned from acoustic model clustering , 2003 .
[14] Mari Ostendorf,et al. Prediction of abstract prosodic labels for speech synthesis , 1996, Comput. Speech Lang..
[15] Raymond N. J. Veldhuis,et al. Reducing audible spectral discontinuities , 2001, IEEE Trans. Speech Audio Process..
[16] Elizabeth Shriberg,et al. Using prosodic and lexical information for speaker identification , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[17] Gérard Bailly,et al. Synthesising attitudes with global rhythmic and intonation contours , 1997, EUROSPEECH.
[18] Satoshi Nakamura,et al. Voice conversion through vector quantization , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.
[19] Yannis Stylianou,et al. Applying the harmonic plus noise model in concatenative speech synthesis , 2001, IEEE Trans. Speech Audio Process..
[20] Philip C. Woodland,et al. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang..
[21] Vassilios Digalakis,et al. Genones: generalized mixture tying in continuous hidden Markov model-based speech recognizers , 1996, IEEE Trans. Speech Audio Process..
[22] Alan W. Black,et al. Unit selection in a concatenative speech synthesis system using a large speech database , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[23] Mari Ostendorf,et al. Joint prosody prediction and unit selection for concatenative speech synthesis , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[24] Hervé Bourlard,et al. Neural networks for statistical recognition of continuous speech , 1995, Proc. IEEE.
[25] Julia Hirschberg,et al. Pitch Accent in Context: Predicting Intonational Prominence from Text , 1993, Artif. Intell..
[26] Eric Moulines,et al. Continuous probabilistic transform for voice conversion , 1998, IEEE Trans. Speech Audio Process..
[27] Alex Acero,et al. Whistler: a trainable text-to-speech system , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[28] Robert E. Donovan,et al. A new distance measure for costing spectral discontinuities in concatenative speech synthesizers , 2001, SSW.
[29] Mari Ostendorf,et al. Text normalization with varied data sources for conversational speech language modeling , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[30] Mari Ostendorf,et al. Efficient integrated response generation from multiple targets using weighted finite state transducers , 2002, Comput. Speech Lang..
[31] John H. L. Hansen,et al. Enhancement, segmentation, and synthesis of speech with application to robust speaker recognition , 1998 .
[32] Richard Sproat,et al. High-accuracy automatic segmentation , 1999, EUROSPEECH.
[33] Klaus A. J. Riederer. Large vocabulary continuous speech recognition , 2000 .
[34] Mari Ostendorf,et al. From HMM's to segment models: a unified view of stochastic modeling for speech recognition , 1996, IEEE Trans. Speech Audio Process..
[35] Daniel Povey,et al. Large scale discriminative training of hidden Markov models for speech recognition , 2002, Comput. Speech Lang..
[36] Alex Acero,et al. HMM-based smoothing for concatenative speech synthesis , 1998, ICSLP.
[37] Jerome R. Bellegarda,et al. Statistical prosodic modeling: from corpus design to parameter estimation , 2001, IEEE Trans. Speech Audio Process..
[38] Jeff A. Bilmes,et al. Robust splicing costs and efficient search with BMM Models for concatenative speech synthesis , 2002, 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[39] Elmar Nöth,et al. Whence and Whither Prosody in Automatic Speech Understanding: A Case Study , 2002 .
[40] Alexander Kain,et al. Spectral voice conversion for text-to-speech synthesis , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[41] Marc C. Beutnagel,et al. The AT&T Next-Gen TTS system , 1999 .
[42] Julia Hirschberg,et al. Automatic classification of intonational phrase boundaries , 1992 .
[43] Keiichi Tokuda,et al. Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[44] R. Rosenfeld,et al. Two decades of statistical language modeling: where do we go from here? , 2000, Proceedings of the IEEE.
[45] Mari Ostendorf,et al. Unit selection for speech synthesis using splicing costs with weighted finite state transducers , 2001, INTERSPEECH.
[46] Paul Taylor,et al. Automatically clustering similar units for unit selection in speech synthesis , 1997, EUROSPEECH.
[47] Darragh O'Brien,et al. Concatenative synthesis based on a harmonic model , 2001, IEEE Trans. Speech Audio Process..
[48] P. Taylor,et al. Analysis and synthesis of intonation using the Tilt model , 2000, The Journal of the Acoustical Society of America.
[49] Harriet J. Nock,et al. Pronunciation modeling by sharing gaussian densities across phonetic models , 1999, EUROSPEECH.
[50] Mari Ostendorf,et al. Flexible speech synthesis using weighted finite-state transducers , 2002 .
[51] Keiichi Tokuda,et al. Speech parameter generation algorithms for HMM-based speech synthesis , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[52] Alex Acero,et al. Automatic generation of synthesis units for trainable text-to-speech systems , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[53] Joshua Goodman,et al. A bit of progress in language modeling , 2001, Comput. Speech Lang..
[54] Sebastian Ohnewald,et al. Speech synthesis using stochastic Markov graphs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[55] Jerome R. Bellegarda,et al. Smooth contour estimation in data-driven pitch modelling , 2001, INTERSPEECH.
[56] W. Chou. Discriminant-function-based minimum recognition error rate pattern-recognition approach to speech recognition , 2000, Proc. IEEE.
[57] Li Deng,et al. A dynamic, feature-based approach to the interface between phonology and phonetics for speech modeling and recognition , 1998, Speech Commun..
[58] Alex Acero,et al. Spoken Language Processing: A Guide to Theory, Algorithm and System Development , 2001 .
[59] Steve Young,et al. A review of large-vocabulary continuous-speech , 1996, IEEE Signal Process. Mag..
[60] Hermann Ney,et al. A word graph algorithm for large vocabulary continuous speech recognition , 1994, Comput. Speech Lang..
[61] Yoshinori Sagisaka,et al. ATR μ-talk speech synthesis system , 1992, ICSLP.
[62] Sadaoki Furui,et al. Research of individuality features in speech waves and automatic speaker recognition techniques , 1986, Speech Commun..
[63] Keiichi Tokuda,et al. Speaker adaptation for HMM-based speech synthesis system using MLLR , 1998, SSW.
[64] Fernando Pereira,et al. Weighted finite-state transducers in speech recognition , 2002, Comput. Speech Lang..
[65] Mehryar Mohri,et al. Rapid unit selection from a large speech corpus for concatenative speech synthesis , 1999, EUROSPEECH.
[66] Hermann Ney,et al. Progress in dynamic programming search for LVCSR , 2000 .
[67] Steve J. Young,et al. State clustering in hidden Markov model-based continuous speech recognition , 1994, Comput. Speech Lang..
[68] Michael W. Macon,et al. Control of spectral dynamics in concatenative speech synthesis , 2001, IEEE Trans. Speech Audio Process..
[69] Shankar Kumar,et al. Normalization of non-standard words , 2001, Comput. Speech Lang..
[70] Paul Taylor,et al. Assigning phrase breaks from part-of-speech sequences , 1997, Comput. Speech Lang..
[71] Mari Ostendorf,et al. A dynamical system model for generating fundamental frequency for speech synthesis , 1999, IEEE Trans. Speech Audio Process..
[72] James R. Glass,et al. Natural-sounding speech synthesis using variable-length units , 1998, ICSLP.
[73] W.J.J. Roberts,et al. Automatic speaker recognition using Gaussian mixture models , 1999, 1999 Information, Decision and Control. Data and Information Fusion Symposium, Signal Processing and Communications Symposium and Decision and Control Symposium. Proceedings (Cat. No.99EX251).
[74] Michael W. Macon,et al. Optimized stopping criteria for tree-based unit selection in concatenative synthesis , 1998, ICSLP.
[75] Robert E. Donovan. Segment pre-selection in decision-tree based speech synthesis systems , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[76] Keiichi Tokuda,et al. Hidden Markov models based on multi-space probability distribution for pitch pattern modeling , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).
[77] Stanley F. Chen,et al. An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.
[78] Mari Ostendorf,et al. Prosody prediction for speech synthesis using transformational rule-based learning , 1998, ICSLP.