Harnessing automatic speech transcription systems (Attelage de systèmes de transcription automatique de la parole)
[1] Frederick Jelinek,et al. Improved clustering techniques for class-based statistical language modeling , 1999 .
[2] Gerald Friedland,et al. Opportunities and challenges of parallelizing speech recognition , 2010 .
[3] Georges Linarès,et al. System Combination by Driven Decoding , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[4] Mari Ostendorf,et al. Modeling long distance dependence in language: topic mixtures versus dynamic cache models , 1996, IEEE Trans. Speech Audio Process..
[5] Ananth Sankar. Bayesian model combination (BAYCOM) for improved recognition , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[6] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.
[7] Yannick Estève,et al. Systèmes de transcription automatique de la parole et logiciels libres , 2004 .
[8] Elmar Nöth,et al. Comparison and Combination of Confidence Measures , 2002, TSD.
[9] Hermann Ney,et al. Improved clustering techniques for class-based statistical language modelling , 1993, EUROSPEECH.
[10] Sylvain Meignier,et al. LIUM SPKDIARIZATION: AN OPEN SOURCE TOOLKIT FOR DIARIZATION , 2010 .
[11] Hynek Hermansky,et al. Perceptual Linear Predictive (PLP) Analysis-Resynthesis Technique , 1991, Final Program and Paper Summaries 1991 IEEE ASSP Workshop on Applications of Signal Processing to Audio and Acoustics.
[12] Richard M. Stern,et al. Speech in Noisy Environments: robust automatic segmentation, feature extraction, and hypothesis combination , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).
[13] T.H. Crystal,et al. Linear prediction of speech , 1977, Proceedings of the IEEE.
[14] Paul Deléglise,et al. Improvements to the LIUM French ASR system based on CMU sphinx: what helps to significantly reduce the word error rate? , 2009, INTERSPEECH.
[15] Mark J. F. Gales,et al. Generating Complementary Systems for Speech Recognition , 2022 .
[16] Richard M. Schwartz,et al. A compact model for speaker-adaptive training , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[17] Hermann Ney,et al. A comparison of two LVR search optimization techniques , 2002, INTERSPEECH.
[18] H. Ney,et al. Linear discriminant analysis for improved large vocabulary continuous speech recognition , 1992, [Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[19] Mark J. F. Gales,et al. Maximum likelihood linear transformations for HMM-based speech recognition , 1998, Comput. Speech Lang..
[20] Yannick Estève. Intégration de sources de connaissances pour la modélisation stochastique du langage appliquée à la parole continue dans un contexte de dialogue oral homme-machine , 2002 .
[21] Paul Deléglise,et al. Unsupervised model adaptation on targeted speech segments for LVCSR system combination , 2010, INTERSPEECH.
[22] Brian Kingsbury,et al. Constructing ensembles of ASR systems using randomized decision trees , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[23] Michael Riley,et al. Towards automatic closed captioning : low latency real time broadcast news transcription , 2002, INTERSPEECH.
[24] Mark J. F. Gales,et al. Progress in the CU-HTK broadcast news transcription system , 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[25] Mark J. F. Gales,et al. Use of Gaussian selection in large vocabulary continuous speech recognition using HMMS , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[26] Ludek Müller,et al. Comparison of MFCC and PLP parameterizations in the speaker independent continuous speech recognition task , 2001, INTERSPEECH.
[27] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[28] Daniel P. W. Ellis. STREAM COMBINATION BEFORE AND/OR AFTER THE ACOUSTIC MODEL , 1999 .
[29] Chin-Hui Lee,et al. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains , 1994, IEEE Trans. Speech Audio Process..
[30] Mark J. F. Gales,et al. Mean and variance adaptation within the MLLR framework , 1996, Comput. Speech Lang..
[31] Hermann Ney,et al. Improved backing-off for M-gram language modeling , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[32] Hermann Ney,et al. Frame based system combination and a comparison with weighted ROVER and CNC , 2006, INTERSPEECH.
[33] Paul Deléglise,et al. The LIUM speech transcription system: a CMU Sphinx III-based system for French broadcast news , 2005, INTERSPEECH.
[34] Gérard Chollet,et al. Vers le temps réel en transcription automatique de la parole grand vocabulaire , 2007 .
[35] Hermann Ney,et al. iROVER: Improving System Combination with Classification , 2007, NAACL.
[36] Hakan Erdogan,et al. Incremental on-line feature space MLLR adaptation for telephony speech recognition , 2002, INTERSPEECH.
[37] Pascale Sébillot,et al. Morphosyntactic processing of n-best lists for improved recognition and confidence measure computation , 2007, INTERSPEECH.
[38] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[39] Frédéric Béchet,et al. The EPAC Corpus: Manual and Automatic Annotations of Conversational Speech in French Broadcast News , 2010, LREC.
[40] Jonathan G. Fiscus,et al. Tools for the analysis of benchmark speech recognition tests , 1990, International Conference on Acoustics, Speech, and Signal Processing.
[41] Benjamin Lecouteux. Reconnaissance automatique de la parole guidée par des transcriptions a priori. (driven decoding for speech recognition system combination) , 2008 .
[42] Mari Ostendorf,et al. Integration of Diverse Recognition Methodologies Through Reevaluation of N-Best Sentence Hypotheses , 1991, HLT.
[43] Sebastian Stüker,et al. Cross-system adaptation and combination for continuous speech recognition: the influence of phoneme set and acoustic front-end , 2006, INTERSPEECH.
[44] Alex Acero,et al. Spoken Language Processing: A Guide to Theory, Algorithm and System Development , 2001 .
[45] Florian Metze,et al. Parallelization Strategies for a Dynamic Lexical Tree Decoder , 2011 .
[46] Andreas Stolcke,et al. Finding consensus among words: lattice-based word error minimization , 1999, EUROSPEECH.
[47] Ronald Rosenfeld,et al. Trigger-based language models: a maximum entropy approach , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[48] John E. Markel,et al. Linear Prediction of Speech , 1976, Communication and Cybernetics.
[49] Georg Heigold,et al. The RWTH 2007 TC-STAR evaluation system for european English and Spanish , 2007, INTERSPEECH.
[50] Robert L. Mercer,et al. Class-based n-gram models of natural language , 1992 .
[51] Michael Collins,et al. Trigger-Based Language Modeling using a Loss-Sensitive Perceptron Algorithm , 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.
[52] Xiang Li,et al. Combining search spaces of heterogeneous recognizers for improved speech recognition , 2002, INTERSPEECH.
[53] Hoirin Kim,et al. Compensating Acoustic Mismatch Using Class-Based Histogram Equalization for Robust Speech Recognition , 2007, EURASIP J. Adv. Signal Process..
[54] Frederick Jelinek,et al. Continuous speech recognition , 1977, SGAR.
[55] Xavier L. Aubert,et al. An overview of decoding techniques for large vocabulary continuous speech recognition , 2002, Comput. Speech Lang..
[56] Julie Mauclair. Mesures de confiance en traitement automatique de la parole et applications , 2006 .
[57] Guillaume Gravier,et al. Corpus description of the ESTER Evaluation Campaign for the Rich Transcription of French Broadcast News , 2004, LREC.
[58] Hermann Ney,et al. Language-model look-ahead for large vocabulary speech recognition , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[59] L. Baum,et al. Statistical Inference for Probabilistic Functions of Finite State Markov Chains , 1966 .
[60] Mark J. F. Gales,et al. Use of contexts in language model interpolation and adaptation , 2009, Comput. Speech Lang..
[61] William J. Byrne,et al. Lattice segmentation and support vector machines for large vocabulary continuous speech recognition , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[62] Mark J. F. Gales,et al. Directed decision trees for generating complementary systems , 2009, Speech Commun..
[63] Gunnar Evermann,et al. Posterior probability decoding, confidence estimation and system combination , 2000 .
[64] Sebastian Stüker,et al. Overview of the IWSLT 2011 evaluation campaign , 2011, IWSLT.
[65] Geoffrey Zweig,et al. Boosting Gaussian mixtures in an LVCSR system , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).
[66] Georges Linarès,et al. Avancées dans le domaine de la transcription automatique par décodage guidé (Improvements on driven decoding system combination) [in French] , 2012, JEP-TALN-RECITAL 2012.
[67] Georges Linarès,et al. Bag of n-gram driven decoding for LVCSR system harnessing , 2011, 2011 IEEE Workshop on Automatic Speech Recognition & Understanding.
[68] Jonathan G. Fiscus,et al. A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER) , 1997, 1997 IEEE Workshop on Automatic Speech Recognition and Understanding Proceedings.
[69] Paul Deléglise,et al. LIUM's systems for the IWSLT 2011 speech translation tasks , 2011, IWSLT.
[70] Hermann Ney,et al. Look-ahead techniques for fast beam search , 1997, 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[71] Georges Linarès,et al. Low latency combination of parallelized single-pass LVCSR systems , 2012, INTERSPEECH.
[72] Vassilios Digalakis,et al. Speaker adaptation using combined transformation and Bayesian methods , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[73] Georges Linarès,et al. Imperfect transcript driven speech recognition , 2006, INTERSPEECH.
[74] Robert E. Schapire,et al. The Boosting Approach to Machine Learning An Overview , 2003 .
[75] Steve J. Young,et al. MMIE training of large vocabulary recognition systems , 1997, Speech Communication.
[76] Guy Perennou,et al. BDLEX lexical data and knowledge base of spoken and written French , 1987, ECST.
[77] Jean-Luc Gauvain,et al. Combining multiple speech recognizers using voting and language model information , 2000, INTERSPEECH.
[78] Philip C. Woodland,et al. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models , 1995, Comput. Speech Lang..
[79] Katrin Kirchhoff. Combining articulatory and acoustic information for speech recognition in noisy and reverberant environments , 1998, ICSLP.
[80] Anne Rogers,et al. Parallel Speech Recognition , 2004, International Journal of Parallel Programming.
[81] Robert L. Mercer,et al. Class-Based n-gram Models of Natural Language , 1992, CL.
[82] Yoav Freund,et al. A decision-theoretic generalization of on-line learning and an application to boosting , 1997, EuroCOLT.
[83] Hermann Ney,et al. Acoustic feature combination for robust speech recognition , 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005..
[84] Lukás Burget. Measurement of Complementarity of Recognition Systems , 2004, TSD.
[85] Xiaodong Cui,et al. High-performance low-latency speech recognition via multi-layered feature streaming and fast Gaussian computation , 2008, INTERSPEECH.
[86] Stanley F. Chen,et al. An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.
[87] Andrew J. Viterbi,et al. Error bounds for convolutional codes and an asymptotically optimum decoding algorithm , 1967, IEEE Trans. Inf. Theory.
[88] I-Fan Chen,et al. A new framework for system combination based on integrated hypothesis space , 2006, INTERSPEECH.
[89] Hermann Ney,et al. Confidence measures for large vocabulary continuous speech recognition , 2001, IEEE Trans. Speech Audio Process..
[90] Georges Linarès,et al. Generalized driven decoding for speech recognition system combination , 2008, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing.
[91] Mark J. F. Gales,et al. The generation and use of regression class trees for MLLR adaptation , 1996 .
[92] Mari Ostendorf,et al. Modeling long distance dependence in language: topic mixtures vs. dynamic cache models , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.
[93] Fernando Pereira,et al. Weighted finite-state transducers in speech recognition , 2002, Comput. Speech Lang..
[94] Frederick Jelinek,et al. Interpolated estimation of Markov source parameters from sparse data , 1980 .
[95] Mark J. F. Gales,et al. Improving LVCSR System Combination Using Neural Network Language Model Cross Adaptation , 2011, INTERSPEECH.
[96] Fethi Bougares,et al. Some recent research work at LIUM based on the use of CMU Sphinx , 2010 .
[97] Takehito Utsuro,et al. Combining outputs of multiple LVCSR models by machine learning , 2005, Systems and Computers in Japan.
[98] Stanley F. Chen,et al. An empirical study of smoothing techniques for language modeling , 1999 .
[99] F. Jelinek,et al. Perplexity—a measure of the difficulty of speech recognition tasks , 1977 .
[100] Renato De Mori,et al. A Cache-Based Natural Language Model for Speech Recognition , 1990, IEEE Trans. Pattern Anal. Mach. Intell..
[101] Mark J. F. Gales,et al. Language model cross adaptation for LVCSR system combination , 2013, Comput. Speech Lang..
[102] Andreas Stolcke,et al. SRILM - an extensible language modeling toolkit , 2002, INTERSPEECH.
[103] Nils J. Nilsson,et al. A Formal Basis for the Heuristic Determination of Minimum Cost Paths , 1968, IEEE Trans. Syst. Sci. Cybern..
[104] Alexander Seward. Low-latency incremental speech transcription in the synface project , 2003, INTERSPEECH.
[105] Stan Davis,et al. Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se , 1980 .
[106] Hermann Ney,et al. Dynamic programming search for continuous speech recognition , 1999, IEEE Signal Process. Mag..
[107] Olivier Galibert,et al. THE LIMSI 2006 TC-STAR TRANSCRIPTION SYSTEMS , 2006 .