Phonetic alignment: speech synthesis-based vs. Viterbi-based
[1] Anthony J. Robinson, et al. An application of recurrent nets to phone probability estimation, 1994, IEEE Trans. Neural Networks.
[2] Jean-Marc Boite, et al. Context independent and context dependent hybrid HMM/ANN systems for vocabulary independent tasks, 1997, EUROSPEECH.
[3] Hynek Hermansky, et al. Integrating RASTA-PLP into speech recognition, 1994, Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing.
[4] Biing-Hwang Juang, et al. On the use of bandpass liftering in speech recognition, 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[5] Michael Picheny, et al. Performance of the IBM large vocabulary continuous speech recognition system on the ARPA Wall Street Journal task, 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[6] M. Eskenazi, et al. The French language database: Defining, planning, and recording a large database, 1984, ICASSP.
[7] Janet M. Baker, et al. The Design for the Wall Street Journal-based CSR Corpus, 1992, HLT.
[8] Horacio Franco, et al. Context-dependent connectionist probability estimation in a hybrid hidden Markov model-neural net speech recognition system, 1994, Comput. Speech Lang.
[9] Maurizio Omologo, et al. Automatic segmentation and labeling of speech based on Hidden Markov Models, 1993, Speech Commun.
[10] Li Lee, et al. A frequency warping approach to speaker normalization, 1998, IEEE Trans. Speech Audio Process.
[11] F. Jelinek, et al. Continuous speech recognition by statistical methods, 1976, Proceedings of the IEEE.
[12] Alan W. Black, et al. Unit selection in a concatenative speech synthesis system using a large speech database, 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[13] Maxine Eskénazi, et al. BREF, a large vocabulary spoken corpus for French, 1991, EUROSPEECH.
[14] Thierry Dutoit, et al. The MBROLA project: towards a set of high quality speech synthesizers free of use for non commercial purposes, 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[15] Piero Cosi, et al. A preliminary statistical evaluation of manual and automatic segmentation discrepancies, 1991, EUROSPEECH.
[16] L. Baum, et al. An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process, 1972.
[17] Olivier Deroo, et al. Automatic detection and correction of pronunciation errors for foreign language learners: the demosthenes application, 1999, EUROSPEECH.
[18] Frank Fallside, et al. A recurrent error propagation network speech recognition system, 1991.
[19] C. Myers, et al. A level building dynamic time warping algorithm for connected word recognition, 1981.
[20] Victor Zue, et al. Speech database development at MIT: TIMIT and beyond, 1990, Speech Commun.
[21] Biing-Hwang Juang, et al. Fundamentals of speech recognition, 1993, Prentice Hall signal processing series.
[22] Alan W. Black, et al. Diphone collection and synthesis, 2000, INTERSPEECH.
[23] Steve Young, et al. Spoken language systems technology workshop, 1995.
[24] Mark Huckvale, et al. Improvements in Speech Synthesis, 2001.
[25] S. M. Peeling, et al. The ARM continuous speech recognition system, 1990, International Conference on Acoustics, Speech, and Signal Processing.
[26] Bert Van Coile, et al. PROTRAN: a prosody transplantation tool for text-to-speech applications, 1994, ICSLP.
[27] Lawrence R. Rabiner, et al. Connected word recognition using a level building dynamic time warping algorithm, 1981, ICASSP.
[28] Steve Renals, et al. The 1994 Abbot hybrid connectionist-HMM large vocabulary recognition system, 1995.
[29] Victor Zue, et al. A procedure for automatic alignment of phonetic transcriptions with continuous speech, 1984, ICASSP.
[30] Colin W. Wightman, et al. The aligner: text to speech alignment using Markov models and a pronunciation dictionary, 1994, SSW.
[31] Hervé Bourlard, et al. Connectionist Speech Recognition: A Hybrid Approach, 1993.
[32] Thierry Dutoit, et al. High-quality speech synthesis for phonetic speech segmentation, 1997, EUROSPEECH.
[33] Michael Riley, et al. Automatic segmentation and labeling of speech, 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.
[34] P.C. Woodland, et al. The 1994 HTK large vocabulary speech recognition system, 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.