Multimodal Crowdsourcing for Transcribing Handwritten Documents
[1] Hermann Ney, et al. White-space models for offline Arabic handwriting recognition, 2008, 2008 19th International Conference on Pattern Recognition.
[2] José B. Mariño, et al. Albayzin speech database: design of the phonetic corpus, 1993, EUROSPEECH.
[3] W. Marsden. I and J, 2012.
[4] Jerome R. Bellegarda, et al. Statistical language model adaptation: review and perspectives, 2004, Speech Commun.
[5] John H. L. Hansen, et al. Improved parcel sorting by combining automatic speech and character recognition, 2012, 2012 IEEE International Conference on Emerging Signal Processing Applications.
[6] Carlos D. Martínez-Hinarejos, et al. A Multimodal Crowdsourcing Framework for Transcribing Historical Handwritten Documents, 2016, DocEng.
[7] Tim Polzehl, et al. Crowdsourcing a Multi-lingual Speech Corpus: Recording, Transcription and Annotation of the CrowdIS Corpora, 2016, LREC.
[8] Maxine Eskénazi, et al. Speaking to the Crowd: Looking at Past Achievements in Using Crowdsourcing for Speech and Predicting Future Challenges, 2011, INTERSPEECH.
[9] Camino Vera. Combining Handwriting and Speech Recognition for Transcribing Historical Handwritten Documents, 2015.
[10] Tara N. Sainath, et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups, 2012, IEEE Signal Processing Magazine.
[11] Steve Young, et al. The HTK book, 1995.
[12] Alfons Juan-Císcar, et al. The RODRIGO Database, 2010, LREC.
[13] Sargur N. Srihari, et al. On-Line and Off-Line Handwriting Recognition: A Comprehensive Survey, 2000, IEEE Trans. Pattern Anal. Mach. Intell.
[14] Carlos D. Martínez-Hinarejos, et al. Combining handwriting and speech recognition for transcribing historical handwritten documents, 2015, 2015 13th International Conference on Document Analysis and Recognition (ICDAR).
[15] Yang Liu, et al. Using N-Best Lists and Confusion Networks for Meeting Summarization, 2011, IEEE Transactions on Audio, Speech, and Language Processing.
[16] Bernhard Rüber, et al. Obtaining confidence measures from sentence probabilities, 1997, EUROSPEECH.
[17] Hermann Ney, et al. Improved backing-off for M-gram language modeling, 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.
[18] Antonio L. Lagarda, et al. A Multimodal Approach to Dictation of Handwritten Historical Documents, 2011, INTERSPEECH.
[19] Per Ola Kristensson, et al. Asynchronous Multimodal Text Entry Using Speech and Gesture Keyboards, 2011, INTERSPEECH.
[20] Hermann Ney, et al. Confidence measures for large vocabulary continuous speech recognition, 2001, IEEE Trans. Speech Audio Process.
[21] Kazuya Takeda, et al. Improvement of multimodal gesture and speech recognition performance using time intervals between gestures and accompanying speech, 2014, EURASIP Journal on Audio, Speech, and Music Processing.
[22] Timothy J. Hazen. Visual model structures and synchrony constraints for audio-visual speech recognition, 2006, IEEE Transactions on Audio, Speech, and Language Processing.
[23] Alon Y. Halevy, et al. Crowdsourcing systems on the World-Wide Web, 2011, Commun. ACM.
[24] Sadaoki Furui, et al. Toward Robust Multimodal Speech Recognition, 2005.
[25] Jian Xue, et al. Improved confusion network algorithm and shortest path search from word lattice, 2005, Proceedings. (ICASSP '05). IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005.
[26] Moisés Pastor, et al. iATROS: A Speech and Handwriting Recognition System, 2008.
[27] Hermann Ney, et al. Bootstrap estimates for confidence intervals in ASR performance evaluation, 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.
[28] Andreas Stolcke, et al. SRILM - an extensible language modeling toolkit, 2002, INTERSPEECH.
[29] Carlos D. Martínez-Hinarejos, et al. Multimodal Output Combination for Transcribing Historical Handwritten Documents, 2015, CAIP.
[30] Biing-Hwang Juang, et al. Fundamentals of speech recognition, 1993, Prentice Hall signal processing series.