Toward Constructing A Multilingual Speech Corpus for Taiwanese (Min-nan), Hakka, and Mandarin Chinese
[1] Francisco Javier Caminero Gil,et al. Discriminative training of GMM for speaker identification , 1996, ICASSP.
[2] B. Atal. Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification. , 1974, The Journal of the Acoustical Society of America.
[3] Tatsuya Kawahara,et al. Task adaptation using MAP estimation in N-gram language modeling , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[4] Clifford Nass,et al. The media equation - how people treat computers, television, and new media like real people and places , 1996 .
[5] Ching-Tang Hsieh,et al. Robust speech features based on wavelet transform with application to speaker identification , 2002 .
[6] Jerome R. Bellegarda. Large vocabulary speech recognition with multispan statistical language models , 2000, IEEE Trans. Speech Audio Process..
[7] Frederick Jelinek,et al. Interpolated estimation of Markov source parameters from sparse data , 1980 .
[8] Ruth F. Eisenberg. Talking to a machine , 1979 .
[9] M. Bradley,et al. Emotion, attention, and the startle reflex. , 1990, Psychological review.
[10] Hsin-Min Wang,et al. Eigenspace-based maximum a posteriori linear regression for rapid speaker adaptation , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[11] Jérôme Boudy,et al. Experiments with a nonlinear spectral subtractor (NSS), Hidden Markov models and the projection, for robust speech recognition in cars , 1991, Speech Commun..
[12] Alon Lavie,et al. Janus-III: speech-to-speech translation in multiple languages , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[13] Zhou Guodong,et al. Interpolation of n-gram and mutual-information based trigger pair language models for Mandarin speech recognition , 1999 .
[14] Steve Young,et al. The HTK book version 3.4 , 2006 .
[15] Jerome R. Bellegarda,et al. A statistical language modeling approach integrating local and global constraints , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[16] J.R. Bellegarda,et al. Exploiting latent semantic information in statistical language modeling , 2000, Proceedings of the IEEE.
[17] Slava M. Katz,et al. Estimation of probabilities from sparse data for the language model component of a speech recognizer , 1987, IEEE Trans. Acoust. Speech Signal Process..
[18] Ching-Tang Hsieh,et al. Robust Speaker Identification System Based on Wavelet Transform and Gaussian Mixture Model , 2003, J. Inf. Sci. Eng..
[19] Chung-Hsien Wu,et al. 台語多聲調音節合成單元資料庫暨文字轉語音雛形系統之發展 (Establish Taiwanese 7-Tones Syllable-based Synthesis Units Database for the Prototype Development of Text-To-Speech System) [In Chinese] , 1999, ROCLING.
[20] M. Abramowitz,et al. Handbook of Mathematical Functions With Formulas, Graphs and Mathematical Tables (National Bureau of Standards Applied Mathematics Series No. 55) , 1965 .
[21] Li Deng,et al. Recursive estimation of nonstationary noise using iterative stochastic approximation for robust speech recognition , 2003, IEEE Trans. Speech Audio Process..
[22] Bonnie J. Dorr,et al. Machine Translation: A View from the Lexicon , 1994, CL.
[23] S. Fukuda,et al. Extracting emotion from voice , 1999, IEEE SMC'99 Conference Proceedings. 1999 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No.99CH37028).
[24] Worldbet,et al. ASCII Phonetic Symbols for the World's Languages: Worldbet , 1994 .
[25] Vladimir Lifschitz,et al. is stronger than , 1979 .
[26] Yiqing Zu,et al. A super phonetic system and multi-dialect Chinese speech corpus for speech recognition , 2002 .
[27] Gary S. Katz,et al. Bimodal expression of emotion by face and voice , 1998, MULTIMEDIA '98.
[28] Biing-Hwang Juang,et al. A vector quantization approach to speaker recognition , 1985, ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[29] Aaron E. Rosenberg,et al. Speaker identification using minimum classification error training , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[30] Marcello Federico,et al. Bayesian estimation methods for n-gram language model adaptation , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[31] Michel Simard,et al. Translation Spotting for Translation Memories , 2003, ParallelTexts@NAACL-HLT.
[32] Ching-Tang Hsieh,et al. A Robust Speaker Identification System Based on Wavelet Transform , 2001 .
[33] Jean Véronis,et al. Parallel Text Processing , 2000 .
[34] Douglas D. O'Shaughnessy,et al. Generalized mel frequency cepstral coefficients for large-vocabulary speaker-independent continuous-speech recognition , 1999, IEEE Trans. Speech Audio Process..
[35] Aaron E. Rosenberg,et al. On the use of instantaneous and transitional spectral information in speaker recognition , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[36] Alexandros Potamianos,et al. Multi-band speech recognition in noisy environments , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).
[37] Ren-Yuan Lyu,et al. Automatic selection of phonetically distributed sentence sets for speaker adaptation with application to large vocabulary Mandarin speech recognition , 1999, Comput. Speech Lang..
[38] S. Furui,et al. Vector-quantization-based speech recognition and speaker recognition techniques , 1991, [1991] Conference Record of the Twenty-Fifth Asilomar Conference on Signals, Systems & Computers.
[39] Yasunari Yoshitomi,et al. Effect of sensor fusion for recognition of emotional states using voice, face image and thermal image of face , 2000, Proceedings 9th IEEE International Workshop on Robot and Human Interactive Communication. IEEE RO-MAN 2000 (Cat. No.00TH8499).
[40] Dustin Boswell,et al. Introduction to Support Vector Machines , 2002 .
[41] 01 New Aurora Activity for Standardization of a Front-End Extension for Tonal Language Recognition and Speech Reconstruction , 2001 .
[42] Frederick Jelinek,et al. Self-organizing language modeling for speech recognition , 1990 .
[43] Hermann Ney,et al. Speech-to-speech translation based on finite-state transducers , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[44] Chung-Hsien Wu,et al. Multi-keyword spotting of telephone speech using a fuzzy search algorithm and keyword-driven two-level CBSM , 2001, Speech Commun..
[45] I. Daubechies. Orthonormal bases of compactly supported wavelets , 1988 .
[46] Brendan J. Frey,et al. Towards non-stationary model-based noise adaptation for large vocabulary speech recognition , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[47] Richard A. Harshman,et al. Indexing by Latent Semantic Analysis , 1990, J. Am. Soc. Inf. Sci..
[48] Richard J. Mammone,et al. Use of non-negative matrix factorization for language model adaptation in a lecture transcription task , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[49] A. B. Poritz,et al. Linear predictive hidden Markov models and the speech signal , 1982, ICASSP.
[50] Naftali Z. Tisby. On the application of mixture AR hidden Markov models to text independent speaker recognition , 1991, IEEE Trans. Signal Process..
[51] Ren-Yuan Lyu,et al. A bi-lingual Mandarin/Taiwanese (Min-nan), large vocabulary, continuous speech recognition system based on the Tong-yong Phonetic Alphabet (TYPA) , 2000, INTERSPEECH.
[52] Wivun Taiffalo Chiung. Articles on Language Planning and Romanization : Romanization and Language Planning in Taiwan , 2001 .
[53] Chiyomi Miyajima,et al. Speaker identification using Gaussian mixture models based on multi-space probability distribution , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[54] Pero Subasic,et al. Affect analysis of text using fuzzy semantic typing , 2001, IEEE Trans. Fuzzy Syst..
[55] Jhing-Fa Wang,et al. 國語文句翻台語語音系統之研究 (A Study for Mandarin Text to Taiwanese speech System) [In Chinese] , 1999, ROCLING.
[56] Ian H. Witten,et al. The zero-frequency problem: Estimating the probabilities of novel events in adaptive text compression , 1991, IEEE Trans. Inf. Theory.
[57] Vassilios Digalakis,et al. Quantization of cepstral parameters for speech recognition over the World Wide Web , 1999, IEEE J. Sel. Areas Commun..
[58] Jont B. Allen,et al. How do humans process and recognize speech? , 1993, IEEE Trans. Speech Audio Process..
[59] Yuang-chin Chiang,et al. An efficient algorithm to select phonetically balanced scripts for constructing a speech corpus , 2003, International Conference on Natural Language Processing and Knowledge Engineering, 2003. Proceedings. 2003.
[60] Nikki Mirghafori,et al. Combining connectionist multi-band and full-band probability streams for speech recognition of natural numbers , 1998, ICSLP.
[61] Toshiyuki Takezawa,et al. End-to-end evaluation in ATR-MATRIX: speech translation system between English and Japanese , 1999, EUROSPEECH.
[62] Chin-Hui Lee,et al. On stochastic feature and model compensation approaches to robust speech recognition , 1998, Speech Commun..
[63] Guodong Zhou,et al. Interpolation of n-gram and mutual-information based trigger pair language models for Mandarin speech recognition , 1999, Comput. Speech Lang..
[64] Wolfgang Wahlster,et al. Verbmobil: Foundations of Speech-to-Speech Translation , 2000, Artificial Intelligence.
[65] Douglas A. Reynolds,et al. Robust text-independent speaker identification using Gaussian mixture speaker models , 1995, IEEE Trans. Speech Audio Process..
[66] Chung-Hsien Wu,et al. Emotion recognition from textual input using an emotional semantic network , 2002, INTERSPEECH.
[67] Ronald Rosenfeld,et al. Trigger-based language models: a maximum entropy approach , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[68] Ryosuke Isotani,et al. A Speech Translation System with Mobile Wireless Clients , 2003, ACL.
[69] Imre Kiss,et al. Noise robust speech parameterization using multiresolution feature extraction , 2001, IEEE Trans. Speech Audio Process..
[70] Jasha Droppo,et al. A noise-robust ASR front-end using Wiener filter constructed from MMSE estimation of clean speech and noise , 2003, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721).
[71] James Nga-Kwok Liu,et al. A hybrid model for Chinese-English machine translation , 1998, SMC'98 Conference Proceedings. 1998 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No.98CH36218).
[72] Susan T. Dumais,et al. Using Linear Algebra for Intelligent Information Retrieval , 1995, SIAM Rev..
[73] Takeshi Kawabata,et al. Back-off method for n-gram smoothing based on binomial posteriori distribution , 1996, 1996 IEEE International Conference on Acoustics, Speech, and Signal Processing Conference Proceedings.
[74] J. Buck,et al. Text-dependent speaker recognition using vector quantization , 1985, ICASSP '85. IEEE International Conference on Acoustics, Speech, and Signal Processing.
[75] Chin-Hui Lee,et al. A study on speaker adaptation of continuous density HMM parameters , 1990, International Conference on Acoustics, Speech, and Signal Processing.
[76] S. Furui. Comparison of Speaker Recognition Methods Using Statistical Features and Dynamic Features , 1981 .
[77] J.H.L. Hansen,et al. An efficient scoring algorithm for Gaussian mixture model based speaker identification , 1998, IEEE Signal Processing Letters.
[78] Kenji Suzuki,et al. The Humanization, Personalization and Authentication Issues in the Design of Interactive Service System , 2003, Trans. SDPS.
[79] Misha Pavel,et al. Towards ASR on partially corrupted speech , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[80] J. Véronis,et al. Evaluation of parallel text alignment systems The ARCADE project , 2000 .
[81] S. Furui,et al. Cepstral analysis technique for automatic speaker verification , 1981 .
[82] Hynek Hermansky,et al. Sub-band based recognition of noisy speech , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[83] Harry Shum,et al. Emotion Detection from Speech to Enrich Multimedia Content , 2001, IEEE Pacific Rim Conference on Multimedia.
[84] Stanley F. Chen,et al. An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.
[85] C. Ding. A similarity-based probability model for latent semantic indexing , 1999, SIGIR '99.
[86] Chiu-yu Tseng,et al. MAT-2000 - design, collection, and validation of a Mandarin 2000-speaker telephone speech database , 2000, INTERSPEECH.
[87] Michel Simard,et al. TransSearch: A Free Translation Memory on the World Wide Web , 2000, LREC.
[88] Xerox Corpora,et al. Speech Recognition Experiments with Linear Predication, Bandpass Filtering, and Dynamic Programming , 1975 .
[89] Hervé Bourlard,et al. A new ASR approach based on independent processing and recombination of partial frequency bands , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.