Prédiction de performance des systèmes de reconnaissance automatique de la parole à l’aide de réseaux de neurones convolutifs [Performance prediction of automatic speech recognition systems using convolutional neural networks]
Benjamin Lecouteux | Olivier Galibert | Laurent Besacier | Zied Elloumi
[1] Hynek Hermansky,et al. Predicting error rates for unknown data in automatic speech recognition , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[2] Yoon Kim,et al. Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.
[3] Sheryl R. Young,et al. Recognition Confidence Measures: Detection of Misrecognitions and Out-Of-Vocabulary Words , 1994.
[4] Maurizio Omologo,et al. Boosted acoustic model learning and hypotheses rescoring on the CHiME-3 task , 2015, 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU).
[5] Olivier Galibert,et al. The ETAPE corpus for the evaluation of speech-based TV content processing in the French language , 2012, LREC.
[6] Richard M. Schwartz,et al. Automatic Detection Of New Words In A Large Vocabulary Continuous Speech Recognition System , 1989, HLT.
[7] Tasha Nagamine,et al. Exploring how deep neural networks form phonemic categories , 2015, INTERSPEECH.
[8] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011.
[9] Julien Pinquier,et al. Prédiction a priori de la qualité de la transcription automatique de la parole bruitée , 2018, XXXIIe Journées d’Études sur la Parole.
[10] Hynek Hermansky,et al. Mean temporal distance: Predicting ASR error from temporal properties of speech signal , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[11] Hamed Zamani,et al. Multitask Learning for Adaptive Quality Estimation of Automatically Transcribed Utterances , 2015, NAACL.
[12] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[13] José Guilherme Camargo de Souza,et al. FBK-UEdin Participation to the WMT13 Quality Estimation Shared Task , 2013, WMT@ACL.
[14] Wei Dai,et al. Very deep convolutional neural networks for raw waveforms , 2016, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[15] Li-Rong Dai,et al. LID-senone Extraction via Deep Neural Networks for End-to-End Language Identification , 2016, Odyssey.
[16] Xing Shi,et al. Does String-Based Neural MT Learn Source Syntax? , 2016, EMNLP.
[17] Guillaume Gravier,et al. The ESTER phase II evaluation campaign for the rich transcription of French broadcast news , 2005, INTERSPEECH.
[18] Olivier Galibert,et al. A presentation of the REPERE challenge , 2012, 2012 10th International Workshop on Content-Based Multimedia Indexing (CBMI).
[19] José Guilherme Camargo de Souza,et al. Quality Estimation for Automatic Speech Recognition , 2014, COLING.
[20] Wouter A. Dreschler,et al. ICRA Noises: Artificial Noise Signals with Speech-like Spectral and Temporal Properties for Hearing Instrument Assessment , 2001.
[21] Geoffrey E. Hinton,et al. Understanding how Deep Belief Networks perform acoustic modelling , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[22] Björn Schuller,et al. Opensmile: the munich versatile and fast open-source audio feature extractor , 2010, ACM Multimedia.
[23] Andreas Stolcke,et al. SRILM - an extensible language modeling toolkit , 2002, INTERSPEECH.
[24] Colin Raffel,et al. librosa: Audio and Music Signal Analysis in Python , 2015, SciPy.
[25] N. Meinshausen,et al. Stability selection , 2008, arXiv:0809.2932.
[26] José Guilherme Camargo de Souza,et al. TranscRater: a Tool for Automatic Speech Recognition Quality Estimation , 2016, ACL.
[27] Yonatan Belinkov,et al. Analyzing Hidden Representations in End-to-End Automatic Speech Recognition Systems , 2017, NIPS.
[28] Jason Weston,et al. Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..
[29] Daniele Falavigna,et al. Driving ROVER with Segment-based ASR Quality Estimation , 2015, ACL.
[30] Guy Perennou,et al. BDLEX: a lexicon for spoken and written french , 1998, LREC.
[31] Thomas Pellegrini,et al. Inferring Phonemic Classes from CNN Activation Maps Using Clustering Techniques , 2016, INTERSPEECH.
[32] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008.
[33] Olivier Galibert,et al. Methodologies for the evaluation of speaker diarization and automatic speech recognition in the presence of overlapping speech , 2013, INTERSPEECH.
[34] Dimitri Palaz,et al. Convolutional Neural Networks-based continuous speech recognition using raw speech signal , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[35] Yonatan Belinkov,et al. Evaluating Layers of Representation in Neural Machine Translation on Part-of-Speech and Semantic Tagging Tasks , 2017, IJCNLP.
[36] Simon King,et al. Investigating gated recurrent neural networks for speech synthesis , 2016.
[37] Pierre Geurts,et al. Extremely randomized trees , 2006, Machine Learning.
[38] Karol J. Piczak. Environmental sound classification with convolutional neural networks , 2015, 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP).
[39] Tara N. Sainath,et al. Learning the speech front-end with raw waveform CLDNNs , 2015, INTERSPEECH.